1. 13 Nov, 2018 32 commits
  2. 10 Nov, 2018 8 commits
    • Greg Kroah-Hartman's avatar
      Linux 4.14.80 · 0b047cbc
      Greg Kroah-Hartman authored
      0b047cbc
    • Christophe Leroy's avatar
      net: fs_enet: do not call phy_stop() in interrupts · 3a2b1d50
      Christophe Leroy authored
      [ Upstream commit f8b39039 ]
      
      In case of TX timeout, fs_timeout() calls phy_stop(), which
      triggers the following BUG_ON() as we are in interrupt.
      
      [92708.199889] kernel BUG at drivers/net/phy/mdio_bus.c:482!
      [92708.204985] Oops: Exception in kernel mode, sig: 5 [#1]
      [92708.210119] PREEMPT
      [92708.212107] CMPC885
      [92708.214216] CPU: 0 PID: 3 Comm: ksoftirqd/0 Tainted: G        W       4.9.61 #39
      [92708.223227] task: c60f0a40 task.stack: c6104000
      [92708.227697] NIP: c02a84bc LR: c02a947c CTR: c02a93d8
      [92708.232614] REGS: c6105c70 TRAP: 0700   Tainted: G        W        (4.9.61)
      [92708.241193] MSR: 00021032 <ME,IR,DR,RI>[92708.244818]   CR: 24000822  XER: 20000000
      [92708.248767]
      GPR00: c02a947c c6105d20 c60f0a40 c62b4c00 00000005 0000001f c069aad8 0001a688
      GPR08: 00000007 00000100 c02a93d8 00000000 000005fc 00000000 c6213240 c06338e4
      GPR16: 00000001 c06330d4 c0633094 00000000 c0680000 c6104000 c6104000 00000000
      GPR24: 00000200 00000000 ffffffff 00000004 00000078 00009032 00000000 c62b4c00
      NIP [c02a84bc] mdiobus_read+0x20/0x74
      [92708.281517] LR [c02a947c] kszphy_config_intr+0xa4/0xc4
      [92708.286547] Call Trace:
      [92708.288980] [c6105d20] [c6104000] 0xc6104000 (unreliable)
      [92708.294339] [c6105d40] [c02a947c] kszphy_config_intr+0xa4/0xc4
      [92708.300098] [c6105d50] [c02a5330] phy_stop+0x60/0x9c
      [92708.305007] [c6105d60] [c02c84d0] fs_timeout+0xdc/0x110
      [92708.310197] [c6105d80] [c035cd48] dev_watchdog+0x268/0x2a0
      [92708.315593] [c6105db0] [c0060288] call_timer_fn+0x34/0x17c
      [92708.321014] [c6105dd0] [c00605f0] run_timer_softirq+0x21c/0x2e4
      [92708.326887] [c6105e50] [c001e19c] __do_softirq+0xf4/0x2f4
      [92708.332207] [c6105eb0] [c001e3c8] run_ksoftirqd+0x2c/0x40
      [92708.337560] [c6105ec0] [c003b420] smpboot_thread_fn+0x1f0/0x258
      [92708.343405] [c6105ef0] [c003745c] kthread+0xbc/0xd0
      [92708.348217] [c6105f40] [c000c400] ret_from_kernel_thread+0x5c/0x64
      [92708.354275] Instruction dump:
      [92708.357207] 7c0803a6 bbc10018 38210020 4e800020 7c0802a6 9421ffe0 54290024 bfc10018
      [92708.364865] 90010024 7c7f1b78 81290008 552902ee <0f090000> 3bc3002c 7fc3f378 90810008
      [92708.372711] ---[ end trace 42b05441616fafd7 ]---
      
      This patch moves fs_timeout() actions into an async worker.
      
      Fixes: commit 48257c4f ("Add fs_enet ethernet network driver, for several embedded platforms")
      Signed-off-by: default avatarChristophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      3a2b1d50
    • Sebastian Andrzej Siewior's avatar
      x86/fpu: Fix i486 + no387 boot crash by only saving FPU registers on context... · ce4454ff
      Sebastian Andrzej Siewior authored
      x86/fpu: Fix i486 + no387 boot crash by only saving FPU registers on context switch if there is an FPU
      
      commit 2224d616 upstream.
      
      Booting an i486 with "no387 nofxsr" ends with with the following crash:
      
         math_emulate: 0060:c101987d
         Kernel panic - not syncing: Math emulation needed in kernel
      
      on the first context switch in user land.
      
      The reason is that copy_fpregs_to_fpstate() tries FNSAVE which does not work
      as the FPU is turned off.
      
      This bug was introduced in:
      
        f1c8cd01 ("x86/fpu: Change fpu->fpregs_active users to fpu->fpstate_active")
      
      Add a check for X86_FEATURE_FPU before trying to save FPU registers (we
      have such a check in switch_fpu_finish() already).
      Signed-off-by: default avatarSebastian Andrzej Siewior <bigeasy@linutronix.de>
      Reviewed-by: default avatarAndy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: stable@vger.kernel.org
      Fixes: f1c8cd01 ("x86/fpu: Change fpu->fpregs_active users to fpu->fpstate_active")
      Link: http://lkml.kernel.org/r/20181016202525.29437-4-bigeasy@linutronix.deSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      
      ce4454ff
    • Nathan Chancellor's avatar
      x86/time: Correct the attribute on jiffies' definition · 70931375
      Nathan Chancellor authored
      commit 53c13ba8 upstream.
      
      Clang warns that the declaration of jiffies in include/linux/jiffies.h
      doesn't match the definition in arch/x86/time/kernel.c:
      
      arch/x86/kernel/time.c:29:42: warning: section does not match previous declaration [-Wsection]
      __visible volatile unsigned long jiffies __cacheline_aligned = INITIAL_JIFFIES;
                                               ^
      ./include/linux/cache.h:49:4: note: expanded from macro '__cacheline_aligned'
                       __section__(".data..cacheline_aligned")))
                       ^
      ./include/linux/jiffies.h:81:31: note: previous attribute is here
      extern unsigned long volatile __cacheline_aligned_in_smp __jiffy_arch_data jiffies;
                                    ^
      ./arch/x86/include/asm/cache.h:20:2: note: expanded from macro '__cacheline_aligned_in_smp'
              __page_aligned_data
              ^
      ./include/linux/linkage.h:39:29: note: expanded from macro '__page_aligned_data'
      #define __page_aligned_data     __section(.data..page_aligned) __aligned(PAGE_SIZE)
                                      ^
      ./include/linux/compiler_attributes.h:233:56: note: expanded from macro '__section'
      #define __section(S)                    __attribute__((__section__(#S)))
                                                             ^
      1 warning generated.
      
      The declaration was changed in commit 7c30f352 ("jiffies.h: declare
      jiffies and jiffies_64 with ____cacheline_aligned_in_smp") but wasn't
      updated here. Make them match so Clang no longer warns.
      
      Fixes: 7c30f352 ("jiffies.h: declare jiffies and jiffies_64 with ____cacheline_aligned_in_smp")
      Signed-off-by: default avatarNathan Chancellor <natechancellor@gmail.com>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20181013005311.28617-1-natechancellor@gmail.comSigned-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      70931375
    • Peter Zijlstra's avatar
      x86/percpu: Fix this_cpu_read() · 577bfbe3
      Peter Zijlstra authored
      commit b59167ac upstream.
      
      Eric reported that a sequence count loop using this_cpu_read() got
      optimized out. This is wrong, this_cpu_read() must imply READ_ONCE()
      because the interface is IRQ-safe, therefore an interrupt can have
      changed the per-cpu value.
      
      Fixes: 7c3576d2 ("[PATCH] i386: Convert PDA into the percpu section")
      Reported-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Acked-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: hpa@zytor.com
      Cc: eric.dumazet@gmail.com
      Cc: bp@alien8.de
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20181011104019.748208519@infradead.orgSigned-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      577bfbe3
    • Zhimin Gu's avatar
      x86, hibernate: Fix nosave_regions setup for hibernation · 7f6273f5
      Zhimin Gu authored
      commit cc55f753 upstream.
      
      On 32bit systems, nosave_regions(non RAM areas) located between
      max_low_pfn and max_pfn are not excluded from hibernation snapshot
      currently, which may result in a machine check exception when
      trying to access these unsafe regions during hibernation:
      
      [  612.800453] Disabling lock debugging due to kernel taint
      [  612.805786] mce: [Hardware Error]: CPU 0: Machine Check Exception: 5 Bank 6: fe00000000801136
      [  612.814344] mce: [Hardware Error]: RIP !INEXACT! 60:<00000000d90be566> {swsusp_save+0x436/0x560}
      [  612.823167] mce: [Hardware Error]: TSC 1f5939fe276 ADDR dd000000 MISC 30e0000086
      [  612.830677] mce: [Hardware Error]: PROCESSOR 0:306c3 TIME 1529487426 SOCKET 0 APIC 0 microcode 24
      [  612.839581] mce: [Hardware Error]: Run the above through 'mcelog --ascii'
      [  612.846394] mce: [Hardware Error]: Machine check: Processor context corrupt
      [  612.853380] Kernel panic - not syncing: Fatal machine check
      [  612.858978] Kernel Offset: 0x18000000 from 0xc1000000 (relocation range: 0xc0000000-0xf7ffdfff)
      
      This is because on 32bit systems, pages above max_low_pfn are regarded
      as high memeory, and accessing unsafe pages might cause expected MCE.
      On the problematic 32bit system, there are reserved memory above low
      memory, which triggered the MCE:
      
      e820 memory mapping:
      [    0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009d7ff] usable
      [    0.000000] BIOS-e820: [mem 0x000000000009d800-0x000000000009ffff] reserved
      [    0.000000] BIOS-e820: [mem 0x00000000000e0000-0x00000000000fffff] reserved
      [    0.000000] BIOS-e820: [mem 0x0000000000100000-0x00000000d160cfff] usable
      [    0.000000] BIOS-e820: [mem 0x00000000d160d000-0x00000000d1613fff] ACPI NVS
      [    0.000000] BIOS-e820: [mem 0x00000000d1614000-0x00000000d1a44fff] usable
      [    0.000000] BIOS-e820: [mem 0x00000000d1a45000-0x00000000d1ecffff] reserved
      [    0.000000] BIOS-e820: [mem 0x00000000d1ed0000-0x00000000d7eeafff] usable
      [    0.000000] BIOS-e820: [mem 0x00000000d7eeb000-0x00000000d7ffffff] reserved
      [    0.000000] BIOS-e820: [mem 0x00000000d8000000-0x00000000d875ffff] usable
      [    0.000000] BIOS-e820: [mem 0x00000000d8760000-0x00000000d87fffff] reserved
      [    0.000000] BIOS-e820: [mem 0x00000000d8800000-0x00000000d8fadfff] usable
      [    0.000000] BIOS-e820: [mem 0x00000000d8fae000-0x00000000d8ffffff] ACPI data
      [    0.000000] BIOS-e820: [mem 0x00000000d9000000-0x00000000da71bfff] usable
      [    0.000000] BIOS-e820: [mem 0x00000000da71c000-0x00000000da7fffff] ACPI NVS
      [    0.000000] BIOS-e820: [mem 0x00000000da800000-0x00000000dbb8bfff] usable
      [    0.000000] BIOS-e820: [mem 0x00000000dbb8c000-0x00000000dbffffff] reserved
      [    0.000000] BIOS-e820: [mem 0x00000000dd000000-0x00000000df1fffff] reserved
      [    0.000000] BIOS-e820: [mem 0x00000000f8000000-0x00000000fbffffff] reserved
      [    0.000000] BIOS-e820: [mem 0x00000000fec00000-0x00000000fec00fff] reserved
      [    0.000000] BIOS-e820: [mem 0x00000000fed00000-0x00000000fed03fff] reserved
      [    0.000000] BIOS-e820: [mem 0x00000000fed1c000-0x00000000fed1ffff] reserved
      [    0.000000] BIOS-e820: [mem 0x00000000fee00000-0x00000000fee00fff] reserved
      [    0.000000] BIOS-e820: [mem 0x00000000ff000000-0x00000000ffffffff] reserved
      [    0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000041edfffff] usable
      
      Fix this problem by changing pfn limit from max_low_pfn to max_pfn.
      This fix does not impact 64bit system because on 64bit max_low_pfn
      is the same as max_pfn.
      Signed-off-by: default avatarZhimin Gu <kookoo.gu@intel.com>
      Acked-by: default avatarPavel Machek <pavel@ucw.cz>
      Signed-off-by: default avatarChen Yu <yu.c.chen@intel.com>
      Acked-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: All applicable <stable@vger.kernel.org>
      Signed-off-by: default avatarRafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7f6273f5
    • Peter Zijlstra's avatar
      x86/tsc: Force inlining of cyc2ns bits · 1595a964
      Peter Zijlstra authored
      commit 4907c68a upstream.
      
      Looking at the asm for native_sched_clock() I noticed we don't inline
      enough. Mostly caused by sharing code with cyc2ns_read_begin(), which
      we didn't used to do. So mark all that __force_inline to make it DTRT.
      
      Fixes: 59eaef78 ("x86/tsc: Remodel cyc2ns to use seqcount_latch()")
      Reported-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: hpa@zytor.com
      Cc: eric.dumazet@gmail.com
      Cc: bp@alien8.de
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20181011104019.695196158@infradead.orgSigned-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1595a964
    • Phil Auld's avatar
      sched/fair: Fix throttle_list starvation with low CFS quota · 1220de22
      Phil Auld authored
      commit baa9be4f upstream.
      
      With a very low cpu.cfs_quota_us setting, such as the minimum of 1000,
      distribute_cfs_runtime may not empty the throttled_list before it runs
      out of runtime to distribute. In that case, due to the change from
      c06f04c7 to put throttled entries at the head of the list, later entries
      on the list will starve.  Essentially, the same X processes will get pulled
      off the list, given CPU time and then, when expired, get put back on the
      head of the list where distribute_cfs_runtime will give runtime to the same
      set of processes leaving the rest.
      
      Fix the issue by setting a bit in struct cfs_bandwidth when
      distribute_cfs_runtime is running, so that the code in throttle_cfs_rq can
      decide to put the throttled entry on the tail or the head of the list.  The
      bit is set/cleared by the callers of distribute_cfs_runtime while they hold
      cfs_bandwidth->lock.
      
      This is easy to reproduce with a handful of CPU consumers. I use 'crash' on
      the live system. In some cases you can simply look at the throttled list and
      see the later entries are not changing:
      
        crash> list cfs_rq.throttled_list -H 0xffff90b54f6ade40 -s cfs_rq.runtime_remaining | paste - - | awk '{print $1"  "$4}' | pr -t -n3
          1     ffff90b56cb2d200  -976050
          2     ffff90b56cb2cc00  -484925
          3     ffff90b56cb2bc00  -658814
          4     ffff90b56cb2ba00  -275365
          5     ffff90b166a45600  -135138
          6     ffff90b56cb2da00  -282505
          7     ffff90b56cb2e000  -148065
          8     ffff90b56cb2fa00  -872591
          9     ffff90b56cb2c000  -84687
         10     ffff90b56cb2f000  -87237
         11     ffff90b166a40a00  -164582
      
        crash> list cfs_rq.throttled_list -H 0xffff90b54f6ade40 -s cfs_rq.runtime_remaining | paste - - | awk '{print $1"  "$4}' | pr -t -n3
          1     ffff90b56cb2d200  -994147
          2     ffff90b56cb2cc00  -306051
          3     ffff90b56cb2bc00  -961321
          4     ffff90b56cb2ba00  -24490
          5     ffff90b166a45600  -135138
          6     ffff90b56cb2da00  -282505
          7     ffff90b56cb2e000  -148065
          8     ffff90b56cb2fa00  -872591
          9     ffff90b56cb2c000  -84687
         10     ffff90b56cb2f000  -87237
         11     ffff90b166a40a00  -164582
      
      Sometimes it is easier to see by finding a process getting starved and looking
      at the sched_info:
      
        crash> task ffff8eb765994500 sched_info
        PID: 7800   TASK: ffff8eb765994500  CPU: 16  COMMAND: "cputest"
          sched_info = {
            pcount = 8,
            run_delay = 697094208,
            last_arrival = 240260125039,
            last_queued = 240260327513
          },
        crash> task ffff8eb765994500 sched_info
        PID: 7800   TASK: ffff8eb765994500  CPU: 16  COMMAND: "cputest"
          sched_info = {
            pcount = 8,
            run_delay = 697094208,
            last_arrival = 240260125039,
            last_queued = 240260327513
          },
      Signed-off-by: default avatarPhil Auld <pauld@redhat.com>
      Reviewed-by: default avatarBen Segall <bsegall@google.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: stable@vger.kernel.org
      Fixes: c06f04c7 ("sched: Fix potential near-infinite distribute_cfs_runtime() loop")
      Link: http://lkml.kernel.org/r/20181008143639.GA4019@pauld.bos.csbSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1220de22