1. 03 Aug, 2018 22 commits
    • Stephen Hemminger's avatar
      hv_netvsc: fix network namespace issues with VF support · 0a84c912
      Stephen Hemminger authored
      [ Upstream commit 7bf7bb37 ]
      
      When finding the parent netvsc device, the search needs to be across
      all netvsc device instances (independent of network namespace).
      
      Find parent device of VF using upper_dev_get routine which
      searches only adjacent list.
      
      Fixes: e8ff40d4 ("hv_netvsc: improve VF device matching")
      Signed-off-by: default avatarStephen Hemminger <sthemmin@microsoft.com>
      
      netns aware byref
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <alexander.levin@microsoft.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0a84c912
    • Juergen Gross's avatar
      xen/netfront: raise max number of slots in xennet_get_responses() · 51b69407
      Juergen Gross authored
      [ Upstream commit 57f230ab ]
      
      The max number of slots used in xennet_get_responses() is set to
      MAX_SKB_FRAGS + (rx->status <= RX_COPY_THRESHOLD).
      
      In old kernel-xen MAX_SKB_FRAGS was 18, while nowadays it is 17. This
      difference is resulting in frequent messages "too many slots" and a
      reduced network throughput for some workloads (factor 10 below that of
      a kernel-xen based guest).
      
      Replacing MAX_SKB_FRAGS by XEN_NETIF_NR_SLOTS_MIN for calculation of
      the max number of slots to use solves that problem (tests showed no
      more messages "too many slots" and throughput was as high as with the
      kernel-xen based guest system).
      
      Replace MAX_SKB_FRAGS-2 by XEN_NETIF_NR_SLOTS_MIN-1 in
      netfront_tx_slot_available() for making it clearer what is really being
      tested without actually modifying the tested value.
      Signed-off-by: default avatarJuergen Gross <jgross@suse.com>
      Reviewed-by: default avatarBoris Ostrovsky <boris.ostrovsky@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <alexander.levin@microsoft.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      51b69407
    • Mark Rutland's avatar
      kcov: ensure irq code sees a valid area · a45f5ee6
      Mark Rutland authored
      [ Upstream commit c9484b98 ]
      
      Patch series "kcov: fix unexpected faults".
      
      These patches fix a few issues where KCOV code could trigger recursive
      faults, discovered while debugging a patch enabling KCOV for arch/arm:
      
      * On CONFIG_PREEMPT kernels, there's a small race window where
        __sanitizer_cov_trace_pc() can see a bogus kcov_area.
      
      * Lazy faulting of the vmalloc area can cause mutual recursion between
        fault handling code and __sanitizer_cov_trace_pc().
      
      * During the context switch, switching the mm can cause the kcov_area to
        be transiently unmapped.
      
      These are prerequisites for enabling KCOV on arm, but the issues
      themsevles are generic -- we just happen to avoid them by chance rather
      than design on x86-64 and arm64.
      
      This patch (of 3):
      
      For kernels built with CONFIG_PREEMPT, some C code may execute before or
      after the interrupt handler, while the hardirq count is zero.  In these
      cases, in_task() can return true.
      
      A task can be interrupted in the middle of a KCOV_DISABLE ioctl while it
      resets the task's kcov data via kcov_task_init().  Instrumented code
      executed during this period will call __sanitizer_cov_trace_pc(), and as
      in_task() returns true, will inspect t->kcov_mode before trying to write
      to t->kcov_area.
      
      In kcov_init_task() we update t->kcov_{mode,area,size} with plain stores,
      which may be re-ordered, torn, etc.  Thus __sanitizer_cov_trace_pc() may
      see bogus values for any of these fields, and may attempt to write to
      memory which is not mapped.
      
      Let's avoid this by using WRITE_ONCE() to set t->kcov_mode, with a
      barrier() to ensure this is ordered before we clear t->kov_{area,size}.
      This ensures that any code execute while kcov_init_task() is preempted
      will either see valid values for t->kcov_{area,size}, or will see that
      t->kcov_mode is KCOV_MODE_DISABLED, and bail out without touching
      t->kcov_area.
      
      Link: http://lkml.kernel.org/r/20180504135535.53744-2-mark.rutland@arm.comSigned-off-by: default avatarMark Rutland <mark.rutland@arm.com>
      Acked-by: default avatarAndrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarSasha Levin <alexander.levin@microsoft.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a45f5ee6
    • Petr Machata's avatar
      mlxsw: spectrum_switchdev: Fix port_vlan refcounting · 73990abb
      Petr Machata authored
      [ Upstream commit 9e25826f ]
      
      Switchdev notifications for addition of SWITCHDEV_OBJ_ID_PORT_VLAN are
      distributed not only on clean addition, but also when flags on an
      existing VLAN are changed. mlxsw_sp_bridge_port_vlan_add() calls
      mlxsw_sp_port_vlan_get() to get at the port_vlan in question, which
      implicitly references the object. This then leads to discrepancies in
      reference counting when the VLAN is removed. spectrum.c warns about the
      problem when the module is removed:
      
      [13578.493090] WARNING: CPU: 0 PID: 2454 at drivers/net/ethernet/mellanox/mlxsw/spectrum.c:2973 mlxsw_sp_port_remove+0xfd/0x110 [mlxsw_spectrum]
      [...]
      [13578.627106] Call Trace:
      [13578.629617]  mlxsw_sp_fini+0x2a/0xe0 [mlxsw_spectrum]
      [13578.634748]  mlxsw_core_bus_device_unregister+0x3e/0x130 [mlxsw_core]
      [13578.641290]  mlxsw_pci_remove+0x13/0x40 [mlxsw_pci]
      [13578.646238]  pci_device_remove+0x31/0xb0
      [13578.650244]  device_release_driver_internal+0x14f/0x220
      [13578.655562]  driver_detach+0x32/0x70
      [13578.659183]  bus_remove_driver+0x47/0xa0
      [13578.663134]  pci_unregister_driver+0x1e/0x80
      [13578.667486]  mlxsw_sp_module_exit+0xc/0x3fa [mlxsw_spectrum]
      [13578.673207]  __x64_sys_delete_module+0x13b/0x1e0
      [13578.677888]  ? exit_to_usermode_loop+0x78/0x80
      [13578.682374]  do_syscall_64+0x39/0xe0
      [13578.685976]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Fix by putting the port_vlan when mlxsw_sp_port_vlan_bridge_join()
      determines it's a flag-only change.
      
      Fixes: b3529af6 ("spectrum: Reference count VLAN entries")
      Signed-off-by: default avatarPetr Machata <petrm@mellanox.com>
      Acked-by: default avatarJiri Pirko <jiri@mellanox.com>
      Signed-off-by: default avatarIdo Schimmel <idosch@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <alexander.levin@microsoft.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      73990abb
    • Johannes Weiner's avatar
      arm64: fix vmemmap BUILD_BUG_ON() triggering on !vmemmap setups · c1550e01
      Johannes Weiner authored
      commit 7b0eb6b4 upstream.
      
      Arnd reports the following arm64 randconfig build error with the PSI
      patches that add another page flag:
      
        /git/arm-soc/arch/arm64/mm/init.c: In function 'mem_init':
        /git/arm-soc/include/linux/compiler.h:357:38: error: call to
        '__compiletime_assert_618' declared with attribute error: BUILD_BUG_ON
        failed: sizeof(struct page) > (1 << STRUCT_PAGE_MAX_SHIFT)
      
      The additional page flag causes other information stored in
      page->flags to get bumped into their own struct page member:
      
        #if SECTIONS_WIDTH+ZONES_WIDTH+NODES_SHIFT+LAST_CPUPID_SHIFT <=
        BITS_PER_LONG - NR_PAGEFLAGS
        #define LAST_CPUPID_WIDTH LAST_CPUPID_SHIFT
        #else
        #define LAST_CPUPID_WIDTH 0
        #endif
      
        #if defined(CONFIG_NUMA_BALANCING) && LAST_CPUPID_WIDTH == 0
        #define LAST_CPUPID_NOT_IN_PAGE_FLAGS
        #endif
      
      which in turn causes the struct page size to exceed the size set in
      STRUCT_PAGE_MAX_SHIFT. This value is an an estimate used to size the
      VMEMMAP page array according to address space and struct page size.
      
      However, the check is performed - and triggers here - on a !VMEMMAP
      config, which consumes an additional 22 page bits for the sparse
      section id. When VMEMMAP is enabled, those bits are returned, cpupid
      doesn't need its own member, and the page passes the VMEMMAP check.
      
      Restrict that check to the situation it was meant to check: that we
      are sizing the VMEMMAP page array correctly.
      
      Says Arnd:
      
          Further experiments show that the build error already existed before,
          but was only triggered with larger values of CONFIG_NR_CPU and/or
          CONFIG_NODES_SHIFT that might be used in actual configurations but
          not in randconfig builds.
      
          With longer CPU and node masks, I could recreate the problem with
          kernels as old as linux-4.7 when arm64 NUMA support got added.
      Reported-by: default avatarArnd Bergmann <arnd@arndb.de>
      Tested-by: default avatarArnd Bergmann <arnd@arndb.de>
      Cc: stable@vger.kernel.org
      Fixes: 1a2db300 ("arm64, numa: Add NUMA support for arm64 platforms.")
      Fixes: 3e1907d5 ("arm64: mm: move vmemmap region right below the linear region")
      Signed-off-by: default avatarJohannes Weiner <hannes@cmpxchg.org>
      Signed-off-by: default avatarWill Deacon <will.deacon@arm.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c1550e01
    • Steven Rostedt (VMware)'s avatar
      tracing: Quiet gcc warning about maybe unused link variable · 4681e882
      Steven Rostedt (VMware) authored
      commit 2519c1bb upstream.
      
      Commit 57ea2a34 ("tracing/kprobes: Fix trace_probe flags on
      enable_trace_kprobe() failure") added an if statement that depends on another
      if statement that gcc doesn't see will initialize the "link" variable and
      gives the warning:
      
       "warning: 'link' may be used uninitialized in this function"
      
      It is really a false positive, but to quiet the warning, and also to make
      sure that it never actually is used uninitialized, initialize the "link"
      variable to NULL and add an if (!WARN_ON_ONCE(!link)) where the compiler
      thinks it could be used uninitialized.
      
      Cc: stable@vger.kernel.org
      Fixes: 57ea2a34 ("tracing/kprobes: Fix trace_probe flags on enable_trace_kprobe() failure")
      Reported-by: default avatarkbuild test robot <lkp@intel.com>
      Signed-off-by: default avatarSteven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4681e882
    • Artem Savkov's avatar
      tracing/kprobes: Fix trace_probe flags on enable_trace_kprobe() failure · 86428ec1
      Artem Savkov authored
      commit 57ea2a34 upstream.
      
      If enable_trace_kprobe fails to enable the probe in enable_k(ret)probe
      it returns an error, but does not unset the tp flags it set previously.
      This results in a probe being considered enabled and failures like being
      unable to remove the probe through kprobe_events file since probes_open()
      expects every probe to be disabled.
      
      Link: http://lkml.kernel.org/r/20180725102826.8300-1-asavkov@redhat.com
      Link: http://lkml.kernel.org/r/20180725142038.4765-1-asavkov@redhat.com
      
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: stable@vger.kernel.org
      Fixes: 41a7dd42 ("tracing/kprobes: Support ftrace_event_file base multibuffer")
      Acked-by: default avatarMasami Hiramatsu <mhiramat@kernel.org>
      Reviewed-by: default avatarJosh Poimboeuf <jpoimboe@redhat.com>
      Signed-off-by: default avatarArtem Savkov <asavkov@redhat.com>
      Signed-off-by: default avatarSteven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      86428ec1
    • Snild Dolkow's avatar
      kthread, tracing: Don't expose half-written comm when creating kthreads · f9574568
      Snild Dolkow authored
      commit 3e536e22 upstream.
      
      There is a window for racing when printing directly to task->comm,
      allowing other threads to see a non-terminated string. The vsnprintf
      function fills the buffer, counts the truncated chars, then finally
      writes the \0 at the end.
      
      	creator                     other
      	vsnprintf:
      	  fill (not terminated)
      	  count the rest            trace_sched_waking(p):
      	  ...                         memcpy(comm, p->comm, TASK_COMM_LEN)
      	  write \0
      
      The consequences depend on how 'other' uses the string. In our case,
      it was copied into the tracing system's saved cmdlines, a buffer of
      adjacent TASK_COMM_LEN-byte buffers (note the 'n' where 0 should be):
      
      	crash-arm64> x/1024s savedcmd->saved_cmdlines | grep 'evenk'
      	0xffffffd5b3818640:     "irq/497-pwr_evenkworker/u16:12"
      
      ...and a strcpy out of there would cause stack corruption:
      
      	[224761.522292] Kernel panic - not syncing: stack-protector:
      	    Kernel stack is corrupted in: ffffff9bf9783c78
      
      	crash-arm64> kbt | grep 'comm\|trace_print_context'
      	#6  0xffffff9bf9783c78 in trace_print_context+0x18c(+396)
      	      comm (char [16]) =  "irq/497-pwr_even"
      
      	crash-arm64> rd 0xffffffd4d0e17d14 8
      	ffffffd4d0e17d14:  2f71726900000000 5f7277702d373934   ....irq/497-pwr_
      	ffffffd4d0e17d24:  726f776b6e657665 3a3631752f72656b   evenkworker/u16:
      	ffffffd4d0e17d34:  f9780248ff003231 cede60e0ffffff9b   12..H.x......`..
      	ffffffd4d0e17d44:  cede60c8ffffffd4 00000fffffffffd4   .....`..........
      
      The workaround in e09e2867 (use strlcpy in __trace_find_cmdline) was
      likely needed because of this same bug.
      
      Solved by vsnprintf:ing to a local buffer, then using set_task_comm().
      This way, there won't be a window where comm is not terminated.
      
      Link: http://lkml.kernel.org/r/20180726071539.188015-1-snild@sony.com
      
      Cc: stable@vger.kernel.org
      Fixes: bc0c38d1 ("ftrace: latency tracer infrastructure")
      Reviewed-by: default avatarSteven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: default avatarSnild Dolkow <snild@sony.com>
      Signed-off-by: default avatarSteven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f9574568
    • Steven Rostedt (VMware)'s avatar
      tracing: Fix possible double free in event_enable_trigger_func() · 10419b0c
      Steven Rostedt (VMware) authored
      commit 15cc7864 upstream.
      
      There was a case that triggered a double free in event_trigger_callback()
      due to the called reg() function freeing the trigger_data and then it
      getting freed again by the error return by the caller. The solution there
      was to up the trigger_data ref count.
      
      Code inspection found that event_enable_trigger_func() has the same issue,
      but is not as easy to trigger (requires harder to trigger failures). It
      needs to be solved slightly different as it needs more to clean up when the
      reg() function fails.
      
      Link: http://lkml.kernel.org/r/20180725124008.7008e586@gandalf.local.home
      
      Cc: stable@vger.kernel.org
      Fixes: 7862ad18 ("tracing: Add 'enable_event' and 'disable_event' event trigger commands")
      Reivewed-by: default avatarMasami Hiramatsu <mhiramat@kernel.org>
      Signed-off-by: default avatarSteven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      10419b0c
    • Steven Rostedt (VMware)'s avatar
      tracing: Fix double free of event_trigger_data · 9158a7de
      Steven Rostedt (VMware) authored
      commit 1863c387 upstream.
      
      Running the following:
      
       # cd /sys/kernel/debug/tracing
       # echo 500000 > buffer_size_kb
      [ Or some other number that takes up most of memory ]
       # echo snapshot > events/sched/sched_switch/trigger
      
      Triggers the following bug:
      
       ------------[ cut here ]------------
       kernel BUG at mm/slub.c:296!
       invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC PTI
       CPU: 6 PID: 6878 Comm: bash Not tainted 4.18.0-rc6-test+ #1066
       Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v03.03 07/14/2016
       RIP: 0010:kfree+0x16c/0x180
       Code: 05 41 0f b6 72 51 5b 5d 41 5c 4c 89 d7 e9 ac b3 f8 ff 48 89 d9 48 89 da 41 b8 01 00 00 00 5b 5d 41 5c 4c 89 d6 e9 f4 f3 ff ff <0f> 0b 0f 0b 48 8b 3d d9 d8 f9 00 e9 c1 fe ff ff 0f 1f 40 00 0f 1f
       RSP: 0018:ffffb654436d3d88 EFLAGS: 00010246
       RAX: ffff91a9d50f3d80 RBX: ffff91a9d50f3d80 RCX: ffff91a9d50f3d80
       RDX: 00000000000006a4 RSI: ffff91a9de5a60e0 RDI: ffff91a9d9803500
       RBP: ffffffff8d267c80 R08: 00000000000260e0 R09: ffffffff8c1a56be
       R10: fffff0d404543cc0 R11: 0000000000000389 R12: ffffffff8c1a56be
       R13: ffff91a9d9930e18 R14: ffff91a98c0c2890 R15: ffffffff8d267d00
       FS:  00007f363ea64700(0000) GS:ffff91a9de580000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       CR2: 000055c1cacc8e10 CR3: 00000000d9b46003 CR4: 00000000001606e0
       Call Trace:
        event_trigger_callback+0xee/0x1d0
        event_trigger_write+0xfc/0x1a0
        __vfs_write+0x33/0x190
        ? handle_mm_fault+0x115/0x230
        ? _cond_resched+0x16/0x40
        vfs_write+0xb0/0x190
        ksys_write+0x52/0xc0
        do_syscall_64+0x5a/0x160
        entry_SYSCALL_64_after_hwframe+0x49/0xbe
       RIP: 0033:0x7f363e16ab50
       Code: 73 01 c3 48 8b 0d 38 83 2c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 83 3d 79 db 2c 00 00 75 10 b8 01 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 1e e3 01 00 48 89 04 24
       RSP: 002b:00007fff9a4c6378 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
       RAX: ffffffffffffffda RBX: 0000000000000009 RCX: 00007f363e16ab50
       RDX: 0000000000000009 RSI: 000055c1cacc8e10 RDI: 0000000000000001
       RBP: 000055c1cacc8e10 R08: 00007f363e435740 R09: 00007f363ea64700
       R10: 0000000000000073 R11: 0000000000000246 R12: 0000000000000009
       R13: 0000000000000001 R14: 00007f363e4345e0 R15: 00007f363e4303c0
       Modules linked in: ip6table_filter ip6_tables snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core snd_seq snd_seq_device i915 snd_pcm snd_timer i2c_i801 snd soundcore i2c_algo_bit drm_kms_helper
      86_pkg_temp_thermal video kvm_intel kvm irqbypass wmi e1000e
       ---[ end trace d301afa879ddfa25 ]---
      
      The cause is because the register_snapshot_trigger() call failed to
      allocate the snapshot buffer, and then called unregister_trigger()
      which freed the data that was passed to it. Then on return to the
      function that called register_snapshot_trigger(), as it sees it
      failed to register, it frees the trigger_data again and causes
      a double free.
      
      By calling event_trigger_init() on the trigger_data (which only ups
      the reference counter for it), and then event_trigger_free() afterward,
      the trigger_data would not get freed by the registering trigger function
      as it would only up and lower the ref count for it. If the register
      trigger function fails, then the event_trigger_free() called after it
      will free the trigger data normally.
      
      Link: http://lkml.kernel.org/r/20180724191331.738eb819@gandalf.local.home
      
      Cc: stable@vger.kerne.org
      Fixes: 93e31ffb ("tracing: Add 'snapshot' event trigger command")
      Reported-by: default avatarMasami Hiramatsu <mhiramat@kernel.org>
      Reviewed-by: default avatarMasami Hiramatsu <mhiramat@kernel.org>
      Signed-off-by: default avatarSteven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9158a7de
    • Tejun Heo's avatar
      delayacct: fix crash in delayacct_blkio_end() after delayacct init failure · a2f85c02
      Tejun Heo authored
      commit b512719f upstream.
      
      While forking, if delayacct init fails due to memory shortage, it
      continues expecting all delayacct users to check task->delays pointer
      against NULL before dereferencing it, which all of them used to do.
      
      Commit c96f5471 ("delayacct: Account blkio completion on the correct
      task"), while updating delayacct_blkio_end() to take the target task
      instead of always using %current, made the function test NULL on
      %current->delays and then continue to operated on @p->delays.  If
      %current succeeded init while @p didn't, it leads to the following
      crash.
      
       BUG: unable to handle kernel NULL pointer dereference at 0000000000000004
       IP: __delayacct_blkio_end+0xc/0x40
       PGD 8000001fd07e1067 P4D 8000001fd07e1067 PUD 1fcffbb067 PMD 0
       Oops: 0000 [#1] SMP PTI
       CPU: 4 PID: 25774 Comm: QIOThread0 Not tainted 4.16.0-9_fbk1_rc2_1180_g6b593215b4d7 #9
       RIP: 0010:__delayacct_blkio_end+0xc/0x40
       Call Trace:
        try_to_wake_up+0x2c0/0x600
        autoremove_wake_function+0xe/0x30
        __wake_up_common+0x74/0x120
        wake_up_page_bit+0x9c/0xe0
        mpage_end_io+0x27/0x70
        blk_update_request+0x78/0x2c0
        scsi_end_request+0x2c/0x1e0
        scsi_io_completion+0x20b/0x5f0
        blk_mq_complete_request+0xa2/0x100
        ata_scsi_qc_complete+0x79/0x400
        ata_qc_complete_multiple+0x86/0xd0
        ahci_handle_port_interrupt+0xc9/0x5c0
        ahci_handle_port_intr+0x54/0xb0
        ahci_single_level_irq_intr+0x3b/0x60
        __handle_irq_event_percpu+0x43/0x190
        handle_irq_event_percpu+0x20/0x50
        handle_irq_event+0x2a/0x50
        handle_edge_irq+0x80/0x1c0
        handle_irq+0xaf/0x120
        do_IRQ+0x41/0xc0
        common_interrupt+0xf/0xf
      
      Fix it by updating delayacct_blkio_end() check @p->delays instead.
      
      Link: http://lkml.kernel.org/r/20180724175542.GP1934745@devbig577.frc2.facebook.com
      Fixes: c96f5471 ("delayacct: Account blkio completion on the correct task")
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Reported-by: default avatarDave Jones <dsj@fb.com>
      Debugged-by: default avatarDave Jones <dsj@fb.com>
      Reviewed-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Cc: Josh Snyder <joshs@netflix.com>
      Cc: <stable@vger.kernel.org>	[4.15+]
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a2f85c02
    • Shakeel Butt's avatar
      kvm, mm: account shadow page tables to kmemcg · 8eead4f5
      Shakeel Butt authored
      commit d97e5e61 upstream.
      
      The size of kvm's shadow page tables corresponds to the size of the
      guest virtual machines on the system.  Large VMs can spend a significant
      amount of memory as shadow page tables which can not be left as system
      memory overhead.  So, account shadow page tables to the kmemcg.
      
      [shakeelb@google.com: replace (GFP_KERNEL|__GFP_ACCOUNT) with GFP_KERNEL_ACCOUNT]
        Link: http://lkml.kernel.org/r/20180629140224.205849-1-shakeelb@google.com
      Link: http://lkml.kernel.org/r/20180627181349.149778-1-shakeelb@google.comSigned-off-by: default avatarShakeel Butt <shakeelb@google.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Peter Feiner <pfeiner@google.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8eead4f5
    • KT Liao's avatar
      Input: elan_i2c - add another ACPI ID for Lenovo Ideapad 330-15AST · ca6427fa
      KT Liao authored
      commit 6f88a643 upstream.
      
      Add ELAN0622 to ACPI mapping table to support Elan touchpad found in
      Ideapad 330-15AST.
      Signed-off-by: default avatarKT Liao <kt.liao@emc.com.tw>
      Reported-by: default avatarAnant Shende <anantshende@gmail.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarDmitry Torokhov <dmitry.torokhov@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ca6427fa
    • Chen-Yu Tsai's avatar
      Input: i8042 - add Lenovo LaVie Z to the i8042 reset list · e0e385e2
      Chen-Yu Tsai authored
      commit 384cf428 upstream.
      
      The Lenovo LaVie Z laptop requires i8042 to be reset in order to
      consistently detect its Elantech touchpad. The nomux and kbdreset
      quirks are not sufficient.
      
      It's possible the other LaVie Z models from NEC require this as well.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarChen-Yu Tsai <wens@csie.org>
      Signed-off-by: default avatarDmitry Torokhov <dmitry.torokhov@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e0e385e2
    • Donald Shanty III's avatar
      Input: elan_i2c - add ACPI ID for lenovo ideapad 330 · b4667635
      Donald Shanty III authored
      commit 938f4500 upstream.
      
      This allows Elan driver to bind to the touchpad found in Lenovo Ideapad 330
      series laptops.
      Signed-off-by: default avatarDonald Shanty III <dshanty@protonmail.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarDmitry Torokhov <dmitry.torokhov@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b4667635
    • Marek Szyprowski's avatar
      spi: spi-s3c64xx: Fix system resume support · c09032b7
      Marek Szyprowski authored
      commit e935dba1 upstream.
      
      Since Linux v4.10 release (commit 1d9174fb "PM / Runtime: Defer
      resuming of the device in pm_runtime_force_resume()"),
      pm_runtime_force_resume() function doesn't runtime resume device if it was
      not runtime active before system suspend. Thus, driver should not do any
      register access after pm_runtime_force_resume() without checking the
      runtime status of the device. To fix this issue, simply move
      s3c64xx_spi_hwinit() call to s3c64xx_spi_runtime_resume() to ensure that
      hardware is always properly initialized. This fixes Synchronous external
      abort issue on system suspend/resume cycle on newer Exynos SoCs.
      Signed-off-by: default avatarMarek Szyprowski <m.szyprowski@samsung.com>
      Reviewed-by: default avatarKrzysztof Kozlowski <krzk@kernel.org>
      Signed-off-by: default avatarMark Brown <broonie@kernel.org>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarSudip Mukherjee <sudipm.mukherjee@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c09032b7
    • Andrew Morton's avatar
      drivers/infiniband/ulp/srpt/ib_srpt.c: fix build with gcc-4.4.4 · e581f7c5
      Andrew Morton authored
      commit 06892cc1 upstream.
      
      gcc-4.4.4 has issues with initialization of anonymous unions:
      
      drivers/infiniband/ulp/srpt/ib_srpt.c: In function 'srpt_zerolength_write':
      drivers/infiniband/ulp/srpt/ib_srpt.c:854: error: unknown field 'wr_cqe' specified in initializer
      drivers/infiniband/ulp/srpt/ib_srpt.c:854: warning: initialization makes integer from pointer without a cast
      
      Work aound this.
      
      Fixes: 2a78cb4d ("IB/srpt: Fix an out-of-bounds stack access in srpt_zerolength_write()")
      Cc: Bart Van Assche <bart.vanassche@wdc.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Jason Gunthorpe <jgg@mellanox.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarDoug Ledford <dledford@redhat.com>
      Signed-off-by: default avatarSudip Mukherjee <sudipm.mukherjee@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e581f7c5
    • Bart Van Assche's avatar
      IB/srpt: Fix an out-of-bounds stack access in srpt_zerolength_write() · 1e8bb2e9
      Bart Van Assche authored
      commit 2a78cb4d upstream.
      
      Avoid triggering an out-of-bounds stack access by changing the type
      of 'wr' from ib_send_wr into ib_rdma_wr.
      
      This patch fixes the following KASAN bug report:
      
      BUG: KASAN: stack-out-of-bounds in rxe_post_send+0x7a9/0x9a0 [rdma_rxe]
      Read of size 8 at addr ffff880068197a48 by task kworker/2:1/44
      
      Workqueue: ib_cm cm_work_handler [ib_cm]
      Call Trace:
       dump_stack+0x8e/0xcd
       print_address_description+0x6f/0x280
       kasan_report+0x25a/0x380
       __asan_load8+0x54/0x90
       rxe_post_send+0x7a9/0x9a0 [rdma_rxe]
       srpt_zerolength_write+0xf0/0x180 [ib_srpt]
       srpt_cm_rtu_recv+0x68/0x110 [ib_srpt]
       srpt_rdma_cm_handler+0xbb/0x15b [ib_srpt]
       cma_ib_handler+0x1aa/0x4a0 [rdma_cm]
       cm_process_work+0x30/0x100 [ib_cm]
       cm_work_handler+0xa86/0x351b [ib_cm]
       process_one_work+0x475/0x9f0
       worker_thread+0x69/0x690
       kthread+0x1ad/0x1d0
       ret_from_fork+0x3a/0x50
      
      Fixes: aaf45bd8 ("IB/srpt: Detect session shutdown reliably")
      Signed-off-by: default avatarBart Van Assche <bart.vanassche@wdc.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarJason Gunthorpe <jgg@mellanox.com>
      Signed-off-by: default avatarSudip Mukherjee <sudipm.mukherjee@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1e8bb2e9
    • Andrew Morton's avatar
      drivers/infiniband/core/verbs.c: fix build with gcc-4.4.4 · d02c9c8b
      Andrew Morton authored
      commit 6ee68773 upstream.
      
      gcc-4.4.4 has issues with initialization of anonymous unions.
      
      drivers/infiniband/core/verbs.c: In function '__ib_drain_sq':
      drivers/infiniband/core/verbs.c:2204: error: unknown field 'wr_cqe' specified in initializer
      drivers/infiniband/core/verbs.c:2204: warning: initialization makes integer from pointer without a cast
      
      Work around this.
      
      Fixes: a1ae7d03 ("RDMA/core: Avoid that ib_drain_qp() triggers an out-of-bounds stack access")
      Cc: Bart Van Assche <bart.vanassche@wdc.com>
      Cc: Steve Wise <swise@opengridcomputing.com>
      Cc: Sagi Grimberg <sagi@grimberg.me>
      Cc: Jason Gunthorpe <jgg@mellanox.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarDoug Ledford <dledford@redhat.com>
      Signed-off-by: default avatarSudip Mukherjee <sudipm.mukherjee@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d02c9c8b
    • Bart Van Assche's avatar
      RDMA/core: Avoid that ib_drain_qp() triggers an out-of-bounds stack access · 3af61871
      Bart Van Assche authored
      commit a1ae7d03 upstream.
      
      This patch fixes the following KASAN complaint:
      
      ==================================================================
      BUG: KASAN: stack-out-of-bounds in rxe_post_send+0x77d/0x9b0 [rdma_rxe]
      Read of size 8 at addr ffff880061aef860 by task 01/1080
      
      CPU: 2 PID: 1080 Comm: 01 Not tainted 4.16.0-rc3-dbg+ #2
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014
      Call Trace:
      dump_stack+0x85/0xc7
      print_address_description+0x65/0x270
      kasan_report+0x231/0x350
      rxe_post_send+0x77d/0x9b0 [rdma_rxe]
      __ib_drain_sq+0x1ad/0x250 [ib_core]
      ib_drain_qp+0x9/0x30 [ib_core]
      srp_destroy_qp+0x51/0x70 [ib_srp]
      srp_free_ch_ib+0xfc/0x380 [ib_srp]
      srp_create_target+0x1071/0x19e0 [ib_srp]
      kernfs_fop_write+0x180/0x210
      __vfs_write+0xb1/0x2e0
      vfs_write+0xf6/0x250
      SyS_write+0x99/0x110
      do_syscall_64+0xee/0x2b0
      entry_SYSCALL_64_after_hwframe+0x42/0xb7
      
      The buggy address belongs to the page:
      page:ffffea000186bbc0 count:0 mapcount:0 mapping:0000000000000000 index:0x0
      flags: 0x4000000000000000()
      raw: 4000000000000000 0000000000000000 0000000000000000 00000000ffffffff
      raw: 0000000000000000 ffffea000186bbe0 0000000000000000 0000000000000000
      page dumped because: kasan: bad access detected
      
      Memory state around the buggy address:
      ffff880061aef700: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      ffff880061aef780: 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00
      >ffff880061aef800: f2 f2 f2 f2 f2 f2 f2 00 00 00 00 00 f2 f2 f2 f2
                                                            ^
      ffff880061aef880: f2 f2 f2 00 00 00 00 00 00 00 00 00 00 00 f2 f2
      ffff880061aef900: f2 f2 f2 00 00 00 00 00 00 00 00 00 00 00 00 00
      ==================================================================
      
      Fixes: 765d6774 ("IB: new common API for draining queues")
      Signed-off-by: default avatarBart Van Assche <bart.vanassche@wdc.com>
      Cc: Steve Wise <swise@opengridcomputing.com>
      Cc: Sagi Grimberg <sagi@grimberg.me>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarJason Gunthorpe <jgg@mellanox.com>
      Signed-off-by: default avatarSudip Mukherjee <sudipm.mukherjee@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3af61871
    • Lixin Wang's avatar
      i2c: core: decrease reference count of device node in i2c_unregister_device · c0b86d26
      Lixin Wang authored
      commit e0638fa4 upstream.
      
      Reference count of device node was increased in of_i2c_register_device,
      but without decreasing it in i2c_unregister_device. Then the added
      device node will never be released. Fix this by adding the of_node_put.
      Signed-off-by: default avatarLixin Wang <alan.1.wang@nokia-sbell.com>
      Tested-by: default avatarWolfram Sang <wsa@the-dreams.de>
      Signed-off-by: default avatarWolfram Sang <wsa@the-dreams.de>
      Cc: stable@kernel.org
      Signed-off-by: default avatarSudip Mukherjee <sudipm.mukherjee@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c0b86d26
    • Kees Cook's avatar
      fork: unconditionally clear stack on fork · 2d5fc7ff
      Kees Cook authored
      commit e01e8063 upstream.
      
      One of the classes of kernel stack content leaks[1] is exposing the
      contents of prior heap or stack contents when a new process stack is
      allocated.  Normally, those stacks are not zeroed, and the old contents
      remain in place.  In the face of stack content exposure flaws, those
      contents can leak to userspace.
      
      Fixing this will make the kernel no longer vulnerable to these flaws, as
      the stack will be wiped each time a stack is assigned to a new process.
      There's not a meaningful change in runtime performance; it almost looks
      like it provides a benefit.
      
      Performing back-to-back kernel builds before:
      	Run times: 157.86 157.09 158.90 160.94 160.80
      	Mean: 159.12
      	Std Dev: 1.54
      
      and after:
      	Run times: 159.31 157.34 156.71 158.15 160.81
      	Mean: 158.46
      	Std Dev: 1.46
      
      Instead of making this a build or runtime config, Andy Lutomirski
      recommended this just be enabled by default.
      
      [1] A noisy search for many kinds of stack content leaks can be seen here:
      https://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=linux+kernel+stack+leak
      
      I did some more with perf and cycle counts on running 100,000 execs of
      /bin/true.
      
      before:
      Cycles: 218858861551 218853036130 214727610969 227656844122 224980542841
      Mean:  221015379122.60
      Std Dev: 4662486552.47
      
      after:
      Cycles: 213868945060 213119275204 211820169456 224426673259 225489986348
      Mean:  217745009865.40
      Std Dev: 5935559279.99
      
      It continues to look like it's faster, though the deviation is rather
      wide, but I'm not sure what I could do that would be less noisy.  I'm
      open to ideas!
      
      Link: http://lkml.kernel.org/r/20180221021659.GA37073@beastSigned-off-by: default avatarKees Cook <keescook@chromium.org>
      Acked-by: default avatarMichal Hocko <mhocko@suse.com>
      Reviewed-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Laura Abbott <labbott@redhat.com>
      Cc: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      2d5fc7ff
  2. 28 Jul, 2018 18 commits
    • Greg Kroah-Hartman's avatar
      Linux 4.14.59 · 53208e12
      Greg Kroah-Hartman authored
      53208e12
    • Arnd Bergmann's avatar
      turn off -Wattribute-alias · e94f784f
      Arnd Bergmann authored
      Starting with gcc-8.1, we get a warning about all system call definitions,
      which use an alias between functions with incompatible prototypes, e.g.:
      
      In file included from ../mm/process_vm_access.c:19:
      ../include/linux/syscalls.h:211:18: warning: 'sys_process_vm_readv' alias between functions of incompatible types 'long int(pid_t,  const struct iovec *, long unsigned int,  const struct iovec *, long unsigned int,  long unsigned int)' {aka 'long int(int,  const struct iovec *, long unsigned int,  const struct iovec *, long unsigned int,  long unsigned int)'} and 'long int(long int,  long int,  long int,  long int,  long int,  long int)' [-Wattribute-alias]
        asmlinkage long sys##name(__MAP(x,__SC_DECL,__VA_ARGS__)) \
                        ^~~
      ../include/linux/syscalls.h:207:2: note: in expansion of macro '__SYSCALL_DEFINEx'
        __SYSCALL_DEFINEx(x, sname, __VA_ARGS__)
        ^~~~~~~~~~~~~~~~~
      ../include/linux/syscalls.h:201:36: note: in expansion of macro 'SYSCALL_DEFINEx'
       #define SYSCALL_DEFINE6(name, ...) SYSCALL_DEFINEx(6, _##name, __VA_ARGS__)
                                          ^~~~~~~~~~~~~~~
      ../mm/process_vm_access.c:300:1: note: in expansion of macro 'SYSCALL_DEFINE6'
       SYSCALL_DEFINE6(process_vm_readv, pid_t, pid, const struct iovec __user *, lvec,
       ^~~~~~~~~~~~~~~
      ../include/linux/syscalls.h:215:18: note: aliased declaration here
        asmlinkage long SyS##name(__MAP(x,__SC_LONG,__VA_ARGS__)) \
                        ^~~
      ../include/linux/syscalls.h:207:2: note: in expansion of macro '__SYSCALL_DEFINEx'
        __SYSCALL_DEFINEx(x, sname, __VA_ARGS__)
        ^~~~~~~~~~~~~~~~~
      ../include/linux/syscalls.h:201:36: note: in expansion of macro 'SYSCALL_DEFINEx'
       #define SYSCALL_DEFINE6(name, ...) SYSCALL_DEFINEx(6, _##name, __VA_ARGS__)
                                          ^~~~~~~~~~~~~~~
      ../mm/process_vm_access.c:300:1: note: in expansion of macro 'SYSCALL_DEFINE6'
       SYSCALL_DEFINE6(process_vm_readv, pid_t, pid, const struct iovec __user *, lvec,
      
      This is really noisy and does not indicate a real problem. In the latest
      mainline kernel, this was addressed by commit bee20031 ("disable
      -Wattribute-alias warning for SYSCALL_DEFINEx()"), which seems too invasive
      to backport.
      
      This takes a much simpler approach and just disables the warning across the
      kernel.
      Signed-off-by: default avatarArnd Bergmann <arnd@arndb.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e94f784f
    • Roman Fietze's avatar
      can: m_can.c: fix setup of CCCR register: clear CCCR NISO bit before checking can.ctrlmode · 08382d3a
      Roman Fietze authored
      commit 393753b2 upstream.
      
      Inside m_can_chip_config(), when setting up the new value of the CCCR,
      the CCCR_NISO bit is not cleared like the others, CCCR_TEST, CCCR_MON,
      CCCR_BRSE and CCCR_FDOE, before checking the can.ctrlmode bits for
      CAN_CTRLMODE_FD_NON_ISO.
      
      This way once the controller was configured for CAN_CTRLMODE_FD_NON_ISO,
      this mode could never be cleared again.
      
      This fix is only relevant for controllers with version 3.1.x or 3.2.x.
      Older versions do not support NISO.
      Signed-off-by: default avatarRoman Fietze <roman.fietze@telemotive.de>
      Cc: linux-stable <stable@vger.kernel.org>
      Signed-off-by: default avatarMarc Kleine-Budde <mkl@pengutronix.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      08382d3a
    • Stephane Grosjean's avatar
      can: peak_canfd: fix firmware < v3.3.0: limit allocation to 32-bit DMA addr only · a55d3d73
      Stephane Grosjean authored
      commit 5d4c94ed upstream.
      
      The DMA logic in firmwares < v3.3.0 embedded in the PCAN-PCIe FD cards
      family is not capable of handling a mix of 32-bit and 64-bit logical
      addresses. If the board is equipped with 2 or 4 CAN ports, then such a
      situation might lead to a PCIe Bus Error "Malformed TLP" packet
      as well as "irq xx: nobody cared" issue.
      
      This patch adds a workaround that requests only 32-bit DMA addresses
      when these might be allocated outside of the 4 GB area.
      
      This issue has been fixed in firmware v3.3.0 and next.
      Signed-off-by: default avatarStephane Grosjean <s.grosjean@peak-system.com>
      Cc: linux-stable <stable@vger.kernel.org>
      Signed-off-by: default avatarMarc Kleine-Budde <mkl@pengutronix.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a55d3d73
    • Anssi Hannula's avatar
      can: xilinx_can: fix RX overflow interrupt not being enabled · 60454a97
      Anssi Hannula authored
      commit 83997997 upstream.
      
      RX overflow interrupt (RXOFLW) is disabled even though xcan_interrupt()
      processes it. This means that an RX overflow interrupt will only be
      processed when another interrupt gets asserted (e.g. for RX/TX).
      
      Fix that by enabling the RXOFLW interrupt.
      
      Fixes: b1201e44 ("can: xilinx CAN controller support")
      Signed-off-by: default avatarAnssi Hannula <anssi.hannula@bitwise.fi>
      Cc: Michal Simek <michal.simek@xilinx.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarMarc Kleine-Budde <mkl@pengutronix.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      60454a97
    • Anssi Hannula's avatar
      can: xilinx_can: fix incorrect clear of non-processed interrupts · 19c756e0
      Anssi Hannula authored
      commit 2f4f0f33 upstream.
      
      xcan_interrupt() clears ERROR|RXOFLV|BSOFF|ARBLST interrupts if any of
      them is asserted. This does not take into account that some of them
      could have been asserted between interrupt status read and interrupt
      clear, therefore clearing them without handling them.
      
      Fix the code to only clear those interrupts that it knows are asserted
      and therefore going to be processed in xcan_err_interrupt().
      
      Fixes: b1201e44 ("can: xilinx CAN controller support")
      Signed-off-by: default avatarAnssi Hannula <anssi.hannula@bitwise.fi>
      Cc: Michal Simek <michal.simek@xilinx.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarMarc Kleine-Budde <mkl@pengutronix.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      19c756e0
    • Anssi Hannula's avatar
      can: xilinx_can: keep only 1-2 frames in TX FIFO to fix TX accounting · 189c7890
      Anssi Hannula authored
      commit 620050d9 upstream.
      
      The xilinx_can driver assumes that the TXOK interrupt only clears after
      it has been acknowledged as many times as there have been successfully
      sent frames.
      
      However, the documentation does not mention such behavior, instead
      saying just that the interrupt is cleared when the clear bit is set.
      
      Similarly, testing seems to also suggest that it is immediately cleared
      regardless of the amount of frames having been sent. Performing some
      heavy TX load and then going back to idle has the tx_head drifting
      further away from tx_tail over time, steadily reducing the amount of
      frames the driver keeps in the TX FIFO (but not to zero, as the TXOK
      interrupt always frees up space for 1 frame from the driver's
      perspective, so frames continue to be sent) and delaying the local echo
      frames.
      
      The TX FIFO tracking is also otherwise buggy as it does not account for
      TX FIFO being cleared after software resets, causing
        BUG!, TX FIFO full when queue awake!
      messages to be output.
      
      There does not seem to be any way to accurately track the state of the
      TX FIFO for local echo support while using the full TX FIFO.
      
      The Zynq version of the HW (but not the soft-AXI version) has watermark
      programming support and with it an additional TX-FIFO-empty interrupt
      bit.
      
      Modify the driver to only put 1 frame into TX FIFO at a time on soft-AXI
      and 2 frames at a time on Zynq. On Zynq the TXFEMP interrupt bit is used
      to detect whether 1 or 2 frames have been sent at interrupt processing
      time.
      
      Tested with the integrated CAN on Zynq-7000 SoC. The 1-frame-FIFO mode
      was also tested.
      
      An alternative way to solve this would be to drop local echo support but
      keep using the full TX FIFO.
      
      v2: Add FIFO space check before TX queue wake with locking to
      synchronize with queue stop. This avoids waking the queue when xmit()
      had just filled it.
      
      v3: Keep local echo support and reduce the amount of frames in FIFO
      instead as suggested by Marc Kleine-Budde.
      
      Fixes: b1201e44 ("can: xilinx CAN controller support")
      Signed-off-by: default avatarAnssi Hannula <anssi.hannula@bitwise.fi>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarMarc Kleine-Budde <mkl@pengutronix.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      189c7890
    • Anssi Hannula's avatar
      can: xilinx_can: fix device dropping off bus on RX overrun · 96bf3257
      Anssi Hannula authored
      commit 2574fe54 upstream.
      
      The xilinx_can driver performs a software reset when an RX overrun is
      detected. This causes the device to enter Configuration mode where no
      messages are received or transmitted.
      
      The documentation does not mention any need to perform a reset on an RX
      overrun, and testing by inducing an RX overflow also indicated that the
      device continues to work just fine without a reset.
      
      Remove the software reset.
      
      Tested with the integrated CAN on Zynq-7000 SoC.
      
      Fixes: b1201e44 ("can: xilinx CAN controller support")
      Signed-off-by: default avatarAnssi Hannula <anssi.hannula@bitwise.fi>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarMarc Kleine-Budde <mkl@pengutronix.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      96bf3257
    • Anssi Hannula's avatar
      can: xilinx_can: fix recovery from error states not being propagated · c5846b2f
      Anssi Hannula authored
      commit 877e0b75 upstream.
      
      The xilinx_can driver contains no mechanism for propagating recovery
      from CAN_STATE_ERROR_WARNING and CAN_STATE_ERROR_PASSIVE.
      
      Add such a mechanism by factoring the handling of
      XCAN_STATE_ERROR_PASSIVE and XCAN_STATE_ERROR_WARNING out of
      xcan_err_interrupt and checking for recovery after RX and TX if the
      interface is in one of those states.
      
      Tested with the integrated CAN on Zynq-7000 SoC.
      
      Fixes: b1201e44 ("can: xilinx CAN controller support")
      Signed-off-by: default avatarAnssi Hannula <anssi.hannula@bitwise.fi>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarMarc Kleine-Budde <mkl@pengutronix.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c5846b2f
    • Anssi Hannula's avatar
      can: xilinx_can: fix power management handling · f820de2a
      Anssi Hannula authored
      commit 8ebd83bd upstream.
      
      There are several issues with the suspend/resume handling code of the
      driver:
      
      - The device is attached and detached in the runtime_suspend() and
        runtime_resume() callbacks if the interface is running. However,
        during xcan_chip_start() the interface is considered running,
        causing the resume handler to incorrectly call netif_start_queue()
        at the beginning of xcan_chip_start(), and on xcan_chip_start() error
        return the suspend handler detaches the device leaving the user
        unable to bring-up the device anymore.
      
      - The device is not brought properly up on system resume. A reset is
        done and the code tries to determine the bus state after that.
        However, after reset the device is always in Configuration mode
        (down), so the state checking code does not make sense and
        communication will also not work.
      
      - The suspend callback tries to set the device to sleep mode (low-power
        mode which monitors the bus and brings the device back to normal mode
        on activity), but then immediately disables the clocks (possibly
        before the device reaches the sleep mode), which does not make sense
        to me. If a clean shutdown is wanted before disabling clocks, we can
        just bring it down completely instead of only sleep mode.
      
      Reorganize the PM code so that only the clock logic remains in the
      runtime PM callbacks and the system PM callbacks contain the device
      bring-up/down logic. This makes calling the runtime PM callbacks during
      e.g. xcan_chip_start() safe.
      
      The system PM callbacks now simply call common code to start/stop the
      HW if the interface was running, replacing the broken code from before.
      
      xcan_chip_stop() is updated to use the common reset code so that it will
      wait for the reset to complete. Reset also disables all interrupts so do
      not do that separately.
      
      Also, the device_may_wakeup() checks are removed as the driver does not
      have wakeup support.
      
      Tested on Zynq-7000 integrated CAN.
      Signed-off-by: default avatarAnssi Hannula <anssi.hannula@bitwise.fi>
      Cc: Michal Simek <michal.simek@xilinx.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarMarc Kleine-Budde <mkl@pengutronix.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f820de2a
    • Anssi Hannula's avatar
      can: xilinx_can: fix RX loop if RXNEMP is asserted without RXOK · 464a3f91
      Anssi Hannula authored
      commit 32852c56 upstream.
      
      If the device gets into a state where RXNEMP (RX FIFO not empty)
      interrupt is asserted without RXOK (new frame received successfully)
      interrupt being asserted, xcan_rx_poll() will continue to try to clear
      RXNEMP without actually reading frames from RX FIFO. If the RX FIFO is
      not empty, the interrupt will not be cleared and napi_schedule() will
      just be called again.
      
      This situation can occur when:
      
      (a) xcan_rx() returns without reading RX FIFO due to an error condition.
      The code tries to clear both RXOK and RXNEMP but RXNEMP will not clear
      due to a frame still being in the FIFO. The frame will never be read
      from the FIFO as RXOK is no longer set.
      
      (b) A frame is received between xcan_rx_poll() reading interrupt status
      and clearing RXOK. RXOK will be cleared, but RXNEMP will again remain
      set as the new message is still in the FIFO.
      
      I'm able to trigger case (b) by flooding the bus with frames under load.
      
      There does not seem to be any benefit in using both RXNEMP and RXOK in
      the way the driver does, and the polling example in the reference manual
      (UG585 v1.10 18.3.7 Read Messages from RxFIFO) also says that either
      RXOK or RXNEMP can be used for detecting incoming messages.
      
      Fix the issue and simplify the RX processing by only using RXNEMP
      without RXOK.
      
      Tested with the integrated CAN on Zynq-7000 SoC.
      
      Fixes: b1201e44 ("can: xilinx CAN controller support")
      Signed-off-by: default avatarAnssi Hannula <anssi.hannula@bitwise.fi>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarMarc Kleine-Budde <mkl@pengutronix.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      464a3f91
    • Rafael J. Wysocki's avatar
      driver core: Partially revert "driver core: correct device's shutdown order" · 55cb8f40
      Rafael J. Wysocki authored
      commit 722e5f2b upstream.
      
      Commit 52cdbdd4 (driver core: correct device's shutdown order)
      introduced a regression by breaking device shutdown on some systems.
      
      Namely, the devices_kset_move_last() call in really_probe() added by
      that commit is a mistake as it may cause parents to follow children
      in the devices_kset list which then causes shutdown to fail.  For
      example, if a device has children before really_probe() is called
      for it (which is not uncommon), that call will cause it to be
      reordered after the children in the devices_kset list and the
      ordering of that list will not reflect the correct device shutdown
      order any more.
      
      Also it causes the devices_kset list to be constantly reordered
      until all drivers have been probed which is totally pointless
      overhead in the majority of cases and it only covered an issue
      with system shutdown, while system-wide suspend/resume potentially
      had the same issue on the affected platforms (which was not covered).
      
      Moreover, the shutdown issue originally addressed by the change in
      really_probe() made by commit 52cdbdd4 is not present in 4.18-rc
      any more, since dra7 started to use the sdhci-omap driver which
      doesn't disable any regulators during shutdown, so the really_probe()
      part of commit 52cdbdd4 can be safely reverted.  [The original
      issue was related to the omap_hsmmc driver used by dra7 previously.]
      
      For the above reasons, revert the really_probe() modifications made
      by commit 52cdbdd4.
      
      The other code changes made by commit 52cdbdd4 are useful and
      they need not be reverted.
      
      Fixes: 52cdbdd4 (driver core: correct device's shutdown order)
      Link: https://lore.kernel.org/lkml/CAFgQCTt7VfqM=UyCnvNFxrSw8Z6cUtAi3HUwR4_xPAc03SgHjQ@mail.gmail.com/Reported-by: default avatarPingfan Liu <kernelfans@gmail.com>
      Tested-by: default avatarPingfan Liu <kernelfans@gmail.com>
      Reviewed-by: default avatarKishon Vijay Abraham I <kishon@ti.com>
      Signed-off-by: default avatarRafael J. Wysocki <rafael.j.wysocki@intel.com>
      Cc: stable <stable@vger.kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      55cb8f40
    • Jerry Zhang's avatar
      usb: gadget: f_fs: Only return delayed status when len is 0 · 5421694d
      Jerry Zhang authored
      commit 4d644abf upstream.
      
      Commit 1b9ba000 ("Allow function drivers to pause control
      transfers") states that USB_GADGET_DELAYED_STATUS is only
      supported if data phase is 0 bytes.
      
      It seems that when the length is not 0 bytes, there is no
      need to explicitly delay the data stage since the transfer
      is not completed until the user responds. However, when the
      length is 0, there is no data stage and the transfer is
      finished once setup() returns, hence there is a need to
      explicitly delay completion.
      
      This manifests as the following bugs:
      
      Prior to 946ef68a ('Let setup() return
      USB_GADGET_DELAYED_STATUS'), when setup is 0 bytes, ffs
      would require user to queue a 0 byte request in order to
      clear setup state. However, that 0 byte request was actually
      not needed and would hang and cause errors in other setup
      requests.
      
      After the above commit, 0 byte setups work since the gadget
      now accepts empty queues to ep0 to clear the delay, but all
      other setups hang.
      
      Fixes: 946ef68a ("Let setup() return USB_GADGET_DELAYED_STATUS")
      Signed-off-by: default avatarJerry Zhang <zhangjerry@google.com>
      Cc: stable <stable@vger.kernel.org>
      Acked-by: default avatarFelipe Balbi <felipe.balbi@linux.intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      5421694d
    • Antti Seppälä's avatar
      usb: dwc2: Fix DMA alignment to start at allocated boundary · 68fc92a0
      Antti Seppälä authored
      commit 56406e01 upstream.
      
      The commit 3bc04e28 ("usb: dwc2: host: Get aligned DMA in a more
      supported way") introduced a common way to align DMA allocations.
      The code in the commit aligns the struct dma_aligned_buffer but the
      actual DMA address pointed by data[0] gets aligned to an offset from
      the allocated boundary by the kmalloc_ptr and the old_xfer_buffer
      pointers.
      
      This is against the recommendation in Documentation/DMA-API.txt which
      states:
      
        Therefore, it is recommended that driver writers who don't take
        special care to determine the cache line size at run time only map
        virtual regions that begin and end on page boundaries (which are
        guaranteed also to be cache line boundaries).
      
      The effect of this is that architectures with non-coherent DMA caches
      may run into memory corruption or kernel crashes with Unhandled
      kernel unaligned accesses exceptions.
      
      Fix the alignment by positioning the DMA area in front of the allocation
      and use memory at the end of the area for storing the orginal
      transfer_buffer pointer. This may have the added benefit of increased
      performance as the DMA area is now fully aligned on all architectures.
      
      Tested with Lantiq xRX200 (MIPS) and RPi Model B Rev 2 (ARM).
      
      Fixes: 3bc04e28 ("usb: dwc2: host: Get aligned DMA in a more supported way")
      Cc: <stable@vger.kernel.org>
      Reviewed-by: default avatarDouglas Anderson <dianders@chromium.org>
      Signed-off-by: default avatarAntti Seppälä <a.seppala@gmail.com>
      Signed-off-by: default avatarFelipe Balbi <felipe.balbi@linux.intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      68fc92a0
    • Bin Liu's avatar
      usb: core: handle hub C_PORT_OVER_CURRENT condition · ac3f65c6
      Bin Liu authored
      commit 249a32b7 upstream.
      
      Based on USB2.0 Spec Section 11.12.5,
      
        "If a hub has per-port power switching and per-port current limiting,
        an over-current on one port may still cause the power on another port
        to fall below specific minimums. In this case, the affected port is
        placed in the Power-Off state and C_PORT_OVER_CURRENT is set for the
        port, but PORT_OVER_CURRENT is not set."
      
      so let's check C_PORT_OVER_CURRENT too for over current condition.
      
      Fixes: 08d1dec6 ("usb:hub set hub->change_bits when over-current happens")
      Cc: <stable@vger.kernel.org>
      Tested-by: default avatarAlessandro Antenucci <antenucci@korg.it>
      Signed-off-by: default avatarBin Liu <b-liu@ti.com>
      Acked-by: default avatarAlan Stern <stern@rowland.harvard.edu>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ac3f65c6
    • Lubomir Rintel's avatar
      usb: cdc_acm: Add quirk for Castles VEGA3000 · e089c305
      Lubomir Rintel authored
      commit 1445cbe4 upstream.
      
      The device (a POS terminal) implements CDC ACM, but has not union
      descriptor.
      Signed-off-by: default avatarLubomir Rintel <lkundrak@v3.sk>
      Acked-by: default avatarOliver Neukum <oneukum@suse.com>
      Cc: stable <stable@vger.kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e089c305
    • Samuel Thibault's avatar
      staging: speakup: fix wraparound in uaccess length check · ab9489c4
      Samuel Thibault authored
      commit b96fba8d upstream.
      
      If softsynthx_read() is called with `count < 3`, `count - 3` wraps, causing
      the loop to copy as much data as available to the provided buffer. If
      softsynthx_read() is invoked through sys_splice(), this causes an
      unbounded kernel write; but even when userspace just reads from it
      normally, a small size could cause userspace crashes.
      
      Fixes: 425e586c ("speakup: add unicode variant of /dev/softsynth")
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarSamuel Thibault <samuel.thibault@ens-lyon.org>
      Signed-off-by: default avatarJann Horn <jannh@google.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ab9489c4
    • Eric Dumazet's avatar
      tcp: add tcp_ooo_try_coalesce() helper · 22e3d317
      Eric Dumazet authored
      [ Upstream commit 58152ecb ]
      
      In case skb in out_or_order_queue is the result of
      multiple skbs coalescing, we would like to get a proper gso_segs
      counter tracking, so that future tcp_drop() can report an accurate
      number.
      
      I chose to not implement this tracking for skbs in receive queue,
      since they are not dropped, unless socket is disconnected.
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Acked-by: default avatarSoheil Hassas Yeganeh <soheil@google.com>
      Acked-by: default avatarYuchung Cheng <ycheng@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      22e3d317