1. 13 Nov, 2017 7 commits
    • Arnaldo Carvalho de Melo's avatar
      perf record: Generate PERF_RECORD_{MMAP,COMM,EXEC} with --delay · d3dbf43c
      Arnaldo Carvalho de Melo authored
      When we use an initial delay, e.g.: 'perf record --delay 1000', we do not
      enable the events until that delay has passed after we started the workload,
      including the tracking event, i.e. the one for which we have attr.mmap, etc,
      enabled to ask the kernel to generate the PERF_RECORD_{MMAP,COMM,EXEC} metadata
      events that will then allow us to resolve addresses in samples to the map, dso
      and symbol. There will be a shadow that even synthesizing samples won't cover,
      i.e. the workload that we start and other processes forking while we
      wait for the initial delay to expire.
      
      So use a dummy event to be the tracking one and make it be enabled on exec.
      
      Before:
      
        # perf record --delay 1000 stress --cpu 1 --timeout 5
        stress: info: [9029] dispatching hogs: 1 cpu, 0 io, 0 vm, 0 hdd
        stress: info: [9029] successful run completed in 5s
        [ perf record: Woken up 3 times to write data ]
        [ perf record: Captured and wrote 0.624 MB perf.data (15908 samples) ]
        # perf script | head
            :9031 9031 32001.826888:       1 cycles:ppp: ffffffff831aa30d event_function (/lib/modules/4.14.0-rc6+/build/vmlinux)
            :9031 9031 32001.826893:       1 cycles:ppp: ffffffff8300d1a0 intel_bts_enable_local (/lib/modules/4.14.0-rc6+/build/vmlinux)
            :9031 9031 32001.826895:       7 cycles:ppp: ffffffff83023870 sched_clock (/lib/modules/4.14.0-rc6+/build/vmlinux)
            :9031 9031 32001.826897:     103 cycles:ppp: ffffffff8300c331 intel_pmu_handle_irq (/lib/modules/4.14.0-rc6+/build/vmlinux)
            :9031 9031 32001.826899:    1615 cycles:ppp: ffffffff830231f8 native_sched_clock (/lib/modules/4.14.0-rc6+/build/vmlinux)
            :9031 9031 32001.826902:   26724 cycles:ppp: ffffffff8384c6a7 native_irq_return_iret (/lib/modules/4.14.0-rc6+/build/vmlinux)
            :9031 9031 32001.826913:  329739 cycles:ppp:     7fb2a5410932 [unknown] ([unknown])
            :9031 9031 32001.827033: 1225451 cycles:ppp:     7fb2a5410930 [unknown] ([unknown])
            :9031 9031 32001.827474: 1391725 cycles:ppp:     7fb2a5410930 [unknown] ([unknown])
            :9031 9031 32001.827978: 1233697 cycles:ppp:     7fb2a5410928 [unknown] ([unknown])
        #
      
      After:
      
        # perf record --delay 1000 stress --cpu 1 --timeout 5
        stress: info: [9741] dispatching hogs: 1 cpu, 0 io, 0 vm, 0 hdd
        stress: info: [9741] successful run completed in 5s
        [ perf record: Woken up 3 times to write data ]
        [ perf record: Captured and wrote 0.751 MB perf.data (15976 samples) ]
        # perf script | head
           stress 9742 32110.959106:       1 cycles:ppp: ffffffff831b26f6 __perf_event_task_sched_in (/lib/modules/4.14.0-rc6+/build/vmlinux)
           stress 9742 32110.959110:       1 cycles:ppp: ffffffff8300c2e9 intel_pmu_handle_irq (/lib/modules/4.14.0-rc6+/build/vmlinux)
           stress 9742 32110.959112:       7 cycles:ppp: ffffffff830231e0 native_sched_clock (/lib/modules/4.14.0-rc6+/build/vmlinux)
           stress 9742 32110.959115:     101 cycles:ppp: ffffffff83023870 sched_clock (/lib/modules/4.14.0-rc6+/build/vmlinux)
           stress 9742 32110.959117:    1533 cycles:ppp: ffffffff830231f8 native_sched_clock (/lib/modules/4.14.0-rc6+/build/vmlinux)
           stress 9742 32110.959119:   23992 cycles:ppp: ffffffff831b0900 ctx_sched_in (/lib/modules/4.14.0-rc6+/build/vmlinux)
           stress 9742 32110.959129:  329406 cycles:ppp:     7f4b1b661930 __random_r (/usr/lib64/libc-2.25.so)
           stress 9742 32110.959249: 1288322 cycles:ppp:     5566e1e7cbc9 hogcpu (/usr/bin/stress)
           stress 9742 32110.959712: 1464046 cycles:ppp:     7f4b1b66179e __random (/usr/lib64/libc-2.25.so)
           stress 9742 32110.960241: 1266918 cycles:ppp:     7f4b1b66195b __random_r (/usr/lib64/libc-2.25.so)
        #
      Reported-by: Bram Stolk <b.stolk@gmail.com>
      Tested-by: Bram Stolk <b.stolk@gmail.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Fixes: 6619a53e ("perf record: Add --initial-delay option")
      Link: http://lkml.kernel.org/n/tip-nrdfchshqxf7diszhxcecqb9@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • Arnaldo Carvalho de Melo's avatar
      perf evlist: Set the correct idx when adding dummy events · 640d5175
      Arnaldo Carvalho de Melo authored
      The evsel->idx field is used mainly to access the right bucket in
      per-event arrays such as the annotation ones, but also to set
      evsel->tracking, which in turn decides which of the events will ask
      for PERF_RECORD_{MMAP,COMM,EXEC} to be generated, i.e. which
      perf_event_attr will have its mmap, etc fields set.
      
      When we were adding the "dummy" event using perf_evlist__add_dummy() we
      were not setting it correctly, which could result in multiple tracking
      events.
      
      Now I'll try using a dummy event as the tracking one when using
      'perf record --delay'. By the time we process the --delay setting
      we may already have the evlist set up, like with:
      
        perf record -e cycles,instructions --delay 1000 ./workload
      
      We will need to add a "dummy" event, then reset evsel->tracking for the
      first event, "cycles", and set it instead on the dummy one, also
      setting its attr.enable_on_exec, so that we get the PERF_RECORD_MMAP,
      etc. metadata events while waiting to enable the explicitly requested
      events. So let's get this straight and set the right evsel->idx.
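
The bookkeeping described above can be modelled in a few lines. This is only a toy sketch in Python, not the perf tool's C code: the `Evsel` class and the helpers `add_event`, `set_tracking_event` and `add_dummy_for_delay` are hypothetical names. It assumes that an event becomes the tracking one by default when its idx is 0, and that idx should equal the event's position in the list.

```python
from dataclasses import dataclass


@dataclass
class Evsel:
    """Toy stand-in for an event selector (hypothetical, not perf's struct)."""
    name: str
    idx: int
    tracking: bool = False
    enable_on_exec: bool = False


def add_event(evlist, name):
    # idx is the position in the list, so per-event arrays can be indexed by it.
    evsel = Evsel(name=name, idx=len(evlist))
    # Assumption: by default only the event with idx 0 is the tracking one.
    evsel.tracking = evsel.idx == 0
    evlist.append(evsel)
    return evsel


def set_tracking_event(evlist, tracking_evsel):
    # Exactly one event may ask for the MMAP/COMM/EXEC metadata records.
    for evsel in evlist:
        evsel.tracking = evsel is tracking_evsel


def add_dummy_for_delay(evlist):
    # Add the dummy last, move tracking to it and enable it on exec.
    dummy = add_event(evlist, "dummy")
    dummy.enable_on_exec = True
    set_tracking_event(evlist, dummy)
    return dummy


evlist = []
add_event(evlist, "cycles")
add_event(evlist, "instructions")
add_dummy_for_delay(evlist)
for evsel in evlist:
    print(evsel)
```

If the dummy were added with a wrong idx of 0, the default rule in this model would mark it as tracking alongside "cycles", which is the "multiple tracking events" situation the commit message warns about.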
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Bram Stolk <b.stolk@gmail.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-nrdfchshqxf7diszhxcecqb9@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • Arnaldo Carvalho de Melo's avatar
      7862edc4
    • Ingo Molnar's avatar
      kprobes: Don't spam the build log with deprecation warnings · fcdfafcb
      Ingo Molnar authored
      The jprobes APIs are deprecated - but are still in occasional use for code that
      few people seem to care about, so stop generating deprecation warnings.
      
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • Linus Torvalds's avatar
      /proc/module: use the same logic as /proc/kallsyms for address exposure · 516fb7f2
      Linus Torvalds authored
      The (alleged) users of the module addresses are the same: kernel
      profiling.
      
      So just expose the same helper and format macros, and unify the logic.
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Linus Torvalds's avatar
      modules: make sysfs attribute files readable by owner only · 277642dc
      Linus Torvalds authored
      This code goes back to the historical bitkeeper tree commit 3f7b0672
      ("Module section offsets in /sys/module"), where Jonathan Corbet wanted
      to show people how to debug loadable modules.
      
      See
      
          https://lwn.net/Articles/88052/
      
      from June 2004.
      
      To expose the required load address information, Jonathan added the
      sections subdirectory for every module in /sys/module, and made them
      S_IRUGO - readable by everybody.
      
      It was a more innocent time, plus those S_IRxxx macro names are a lot
      more confusing than the octal numbers are, so maybe it wasn't even
      intentional.  But here we are, thirteen years later, and I'll just change
      it to S_IRUSR instead.
      
      Let's see if anybody even notices.
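
For readers who find the macro names harder to read than the octal values, the difference between the two modes can be checked directly. This is a small illustration in Python, not kernel code: `S_IRUGO` is a kernel-only convenience macro, so it is rebuilt here from the standard POSIX bits, and `describe` is a hypothetical helper.

```python
import stat

# S_IRUGO does not exist in Python's stat module; it is reconstructed here
# as "readable by user, group and others".
S_IRUGO = stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH


def describe(mode):
    """Return the octal form and the ls-style string for a regular file."""
    return oct(mode), stat.filemode(stat.S_IFREG | mode)


print(describe(S_IRUGO))        # world-readable, the old mode
print(describe(stat.S_IRUSR))   # owner-only, the new mode
```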
      
      Cc: Jonathan Corbet <corbet@lwn.net>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Linus Torvalds's avatar
      Merge branch 'kallsyms-restrictions' · 9d560410
      Linus Torvalds authored
      Merge /proc/kallsyms pointer value restrictions.
      
      Instead of using %pK, and making it about root access (at the wrong
      time, no less), make the whole choice of whether to show the actual
      pointer value be very explicit to the kallsyms code.
      
      In particular, we can now default to not doing so, and yet avoid
      annoying kernel profiling by actually looking at whether kernel
      profiling is allowed or not (by default it is not).
      
      This is all mostly preparation for the real "let's stop leaking kernel
      addresses" work that Tobin Harding is working on.
      
      Small steps.
      
      * kallsyms-restrictions:
        stop using '%pK' for /proc/kallsyms pointer values
  2. 12 Nov, 2017 4 commits
    • Randy Dunlap's avatar
      modpost: detect modules without a MODULE_LICENSE · ba1029c9
      Randy Dunlap authored
      Partially revert commit 2fa36568 ("kbuild: soften MODULE_LICENSE
      check") so that modpost detects modules that do not have a
      MODULE_LICENSE.
      
      Sam's commit also changed the fatal error to a warning, which I am
      leaving as is.
      
      This gives advance notice of when a module has no license and will taint
      the kernel if the module is loaded.
      
      This produces the following warnings on x86_64 allmodconfig:
      
          MODPOST 6520 modules
        WARNING: modpost: missing MODULE_LICENSE() in drivers/auxdisplay/img-ascii-lcd.o
        WARNING: modpost: missing MODULE_LICENSE() in drivers/gpio/gpio-ath79.o
        WARNING: modpost: missing MODULE_LICENSE() in drivers/gpio/gpio-iop.o
        WARNING: modpost: missing MODULE_LICENSE() in drivers/iio/accel/kxsd9-i2c.o
        WARNING: modpost: missing MODULE_LICENSE() in drivers/iio/adc/qcom-vadc-common.o
        WARNING: modpost: missing MODULE_LICENSE() in drivers/media/platform/mtk-vcodec/mtk-vcodec-common.o
        WARNING: modpost: missing MODULE_LICENSE() in drivers/media/platform/soc_camera/soc_scale_crop.o
        WARNING: modpost: missing MODULE_LICENSE() in drivers/mtd/nand/denali_pci.o
        WARNING: modpost: missing MODULE_LICENSE() in drivers/net/phy/cortina.o
        WARNING: modpost: missing MODULE_LICENSE() in drivers/pinctrl/pxa/pinctrl-pxa2xx.o
        WARNING: modpost: missing MODULE_LICENSE() in drivers/power/reset/zx-reboot.o
        WARNING: modpost: missing MODULE_LICENSE() in drivers/rpmsg/qcom_glink_native.o
        WARNING: modpost: missing MODULE_LICENSE() in drivers/staging/comedi/drivers/ni_atmio.o
        WARNING: modpost: missing MODULE_LICENSE() in net/9p/9pnet_xen.o
        WARNING: modpost: missing MODULE_LICENSE() in sound/soc/codecs/snd-soc-pcm512x-spi.o
      Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
      Cc: Sam Ravnborg <sam@ravnborg.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Linus Torvalds's avatar
      Linux 4.14 · bebc6082
      Linus Torvalds authored
    • Linus Torvalds's avatar
      Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 152bbb43
      Linus Torvalds authored
      Pull x86 fixes from Thomas Gleixner:
       "A set of small fixes:
      
         - make KGDB work again which got broken by the conversion of WARN()
           to #UD. The WARN fixup needs to run before the notifier callchain,
           otherwise KGDB tries to handle it and crashes.
      
         - disable KASAN in the ORC unwinder to prevent false positive KASAN
           warnings
      
         - prevent default mapping above 47bit when 5 level page tables are
           enabled
      
         - make the delay calibration optimization work correctly, which had
           the conditionals the wrong way around and was operating on data
           which was not yet updated.
      
         - remove the bogus X86_TRAP_BP trap init from the default IDT init
           table, which broke 32bit int3 handling by overwriting the correct
           int3 setup.
      
         - replace this_cpu* with boot_cpu_data access in the preemptible
           oprofile init code"
      
      * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/debug: Handle warnings before the notifier chain, to fix KGDB crash
        x86/mm: Fix ELF_ET_DYN_BASE for 5-level paging
        x86/idt: Remove X86_TRAP_BP initialization in idt_setup_traps()
        x86/oprofile/ppro: Do not use __this_cpu*() in preemptible context
        x86/unwind: Disable KASAN checking in the ORC unwinder
        x86/smpboot: Make optimization of delay calibration work correctly
    • Linus Torvalds's avatar
      Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 69581c74
      Linus Torvalds authored
      Pull perf tool fixes from Thomas Gleixner:
       "A small set of fixes for perf tool:
      
         - synchronize the i915 drm header to avoid the 'out of date' warning
      
         - make sure that perf trace cleans up its temporary files on exit
      
         - unbreak the build with newer flex versions
      
         - add missing braces in the eBPF parsing rules"
      
      * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        tooling/headers: Sync the tools/include/uapi/drm/i915_drm.h UAPI header
        perf trace: Call machine__exit() at exit
        perf tools: Fix eBPF event specification parsing
        perf tools: Add "reject" option for parse-events.l
  3. 11 Nov, 2017 8 commits
  4. 10 Nov, 2017 21 commits
    • Linus Torvalds's avatar
      Merge tag 'ceph-for-4.14-rc9' of git://github.com/ceph/ceph-client · ca916599
      Linus Torvalds authored
      Pull ceph fix from Ilya Dryomov:
       "Memory allocation flags fix, marked for stable"
      
      * tag 'ceph-for-4.14-rc9' of git://github.com/ceph/ceph-client:
        rbd: use GFP_NOIO for parent stat and data requests
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input · 60cfc98b
      Linus Torvalds authored
      Pull input layer updates from Dmitry Torokhov:
      
       - a new ACPI ID for Elan touchpad found in yet another Ideapad model
      
       - Synaptics RMI4 will allow binding to controllers reporting SMB
         version 3 (note that we are not adding any new ACPI IDs to the
         Synaptics PS/2 driver so unless user explicitly enables intertouch
         support there is no user-visible change)
      
       - a fixup to TSC 2004/5 touchscreen driver to mark input devices as
         "direct" to help userspace identify the type of device they are
         dealing with
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
        Input: synaptics-rmi4 - RMI4 can also use SMBUS version 3
        Input: tsc200x-core - set INPUT_PROP_DIRECT
        Input: elan_i2c - add ELAN060C to the ACPI table
    • Linus Torvalds's avatar
      Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · 5cf2360b
      Linus Torvalds authored
      Pull KVM fix from Radim Krčmář:
       "Fix PPC HV host crash that can occur as a result of resizing the guest
        hashed page table"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
        KVM: PPC: Book3S HV: Fix exclusion between HPT resizing and other HPT updates
    • Linus Torvalds's avatar
      Merge tag 'mips_fixes_4.14_2' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/mips · a579e949
      Linus Torvalds authored
      Pull MIPS fixes from James Hogan:
       "A final few MIPS fixes for 4.14:
      
         - fix BMIPS NULL pointer dereference (4.7)
      
         - fix AR7 early GPIO init allocation failure (3.19)
      
         - fix dead serial output on certain AR7 platforms (2.6.35)"
      
      * tag 'mips_fixes_4.14_2' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/mips:
        MIPS: AR7: Ensure that serial ports are properly set up
        MIPS: AR7: Defer registration of GPIO
        MIPS: BMIPS: Fix missing cbr address
    • Maciej W. Rozycki's avatar
      .mailmap: Add Maciej W. Rozycki's Imagination e-mail address · 085c17ff
      Maciej W. Rozycki authored
      Following my recent transition from Imagination Technologies to the
      reincarnated MIPS company add a .mailmap mapping for my work address,
      so that `scripts/get_maintainer.pl' gets it right for past commits.
      Signed-off-by: Maciej W. Rozycki <macro@mips.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Linus Torvalds's avatar
      Revert "x86: CPU: Fix up "cpu MHz" in /proc/cpuinfo" · ea0ee339
      Linus Torvalds authored
      This reverts commit 941f5f0f.
      
      Sadly, it turns out that we really can't just do the cross-CPU IPI to
      all CPUs to get their proper frequencies, because it's much too
      expensive on systems with lots of cores.
      
      So we'll have to revert this for now, and revisit it using a smarter
      model (probably doing one system-wide IPI at open time, and doing all
      the frequency calculations in parallel).
      Reported-by: WANG Chao <chao.wang@ucloud.cn>
      Reported-by: Ingo Molnar <mingo@kernel.org>
      Cc: Rafael J Wysocki <rafael.j.wysocki@intel.com>
      Cc: stable@kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Linus Torvalds's avatar
      Merge tag 'drm-fixes-for-v4.14-rc9' of git://people.freedesktop.org/~airlied/linux · 3e81277a
      Linus Torvalds authored
      Pull drm fixes from Dave Airlie:
       "Last few patches to wrap up.
      
        Two i915 fixes that are on their way to stable, one vmware black
        screen bug, and one const patch that I was going to drop, but it was
        clearly a pretty safe one liner"
      
      * tag 'drm-fixes-for-v4.14-rc9' of git://people.freedesktop.org/~airlied/linux:
        drm/i915: Deconstruct struct sgt_dma initialiser
        drm/i915: Reject unknown syncobj flags
        drm/vmwgfx: Fix Ubuntu 17.10 Wayland black screen issue
        drm/vmwgfx: constify vmw_fence_ops
    • Marek Vasut's avatar
      can: ifi: Fix transmitter delay calculation · 4f711675
      Marek Vasut authored
      The CANFD transmitter delay calculation formula was updated in the
      latest software drop from IFI and improves the behavior of the IFI
      CANFD core during bitrate switching. Use the new formula to improve
      stability of the CANFD operation.
      Signed-off-by: Marek Vasut <marex@denx.de>
      Cc: Markus Marb <markus@marb.org>
      Cc: linux-stable <stable@vger.kernel.org>
      Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
    • Yuchung Cheng's avatar
      tcp: fix tcp_fastretrans_alert warning · 0eb96bf7
      Yuchung Cheng authored
      This patch fixes the cause of a WARNING indicating TCP has pending
      retransmission in Open state in tcp_fastretrans_alert().
      
      The root cause is a bad interaction between path MTU probing,
      if enabled, and the RACK loss detection. Upon receiving a SACK
      above the sequence of the MTU probing packet, RACK could mark the
      probe packet lost in tcp_fastretrans_alert(), prior to calling
      tcp_simple_retransmit().
      
      tcp_simple_retransmit() only enters Loss state if it newly marks
      the probe packet lost. If the probe packet is already identified as
      lost by RACK, the sender remains in Open state with some packets
      marked lost and retransmitted. Then the next SACK would trigger
      the warning. The likely scenario is that the probe packet was
      lost due to its size or network congestion. The actual impact of
      this warning is small: at worst the sender enters fast recovery one
      ACK later.
      
      The simple fix is always entering recovery (Loss) state if some
      packet is marked lost during path MTU probing.
      
      Fixes: a0370b3f ("tcp: enable RACK loss detection to trigger recovery")
      Reported-by: Oleksandr Natalenko <oleksandr@natalenko.name>
      Reported-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
      Reported-by: Roman Gushchin <guro@fb.com>
      Signed-off-by: Yuchung Cheng <ycheng@google.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Acked-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Eric Dumazet's avatar
      tcp: gso: avoid refcount_t warning from tcp_gso_segment() · 7ec318fe
      Eric Dumazet authored
      When a GSO skb of truesize O is segmented into 2 new skbs of truesize N1
      and N2, we want to transfer socket ownership to the new fresh skbs.
      
      In order to avoid expensive atomic operations on a cache line subject to
      cache bouncing, we replace the sequence :
      
      refcount_add(N1, &sk->sk_wmem_alloc);
      refcount_add(N2, &sk->sk_wmem_alloc); // repeated by number of segments
      
      refcount_sub(O, &sk->sk_wmem_alloc);
      
      by a single
      
      refcount_add(sum_of(N) - O, &sk->sk_wmem_alloc);
      
      Problem is :
      
      In some pathological cases, sum(N) - O might be a negative number, and
      syzkaller bot was apparently able to trigger this trace [1]
      
      atomic_t was ok with this construct, but we need to take care of the
      negative delta with refcount_t
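
Why a negative delta is harmless for a plain atomic counter but trips a saturating one can be seen in a toy model. The sketch below is Python, not the kernel's C: the `Refcount` class and `transfer_ownership` are hypothetical, and the model assumes a 32-bit unsigned counter that pins itself at its maximum when an addition wraps around. Choosing add or sub by the sign of the delta is one way to handle it, shown here only as an illustration.

```python
UINT_MAX = 0xFFFFFFFF


class Refcount:
    """Toy model of a saturating 32-bit reference counter."""

    def __init__(self, value):
        self.value = value
        self.saturated = False

    def add(self, i):
        i &= UINT_MAX                    # a negative int becomes a huge unsigned value
        new = (self.value + i) & UINT_MAX
        if new < self.value:             # wrapped around: saturate instead
            new = UINT_MAX
            self.saturated = True
        self.value = new

    def sub(self, i):
        if not 0 <= i <= self.value:
            raise ValueError("refcount underflow")
        self.value -= i


def transfer_ownership(counter, old_truesize, new_truesizes):
    """Apply sum(N) - O in one step, picking add or sub by the sign."""
    delta = sum(new_truesizes) - old_truesize
    if delta >= 0:
        counter.add(delta)
    else:
        counter.sub(-delta)
    return delta


naive = Refcount(10000)
naive.add(-2048)                         # negative delta passed straight to add
print("naive saturated:", naive.saturated)

careful = Refcount(10000)
transfer_ownership(careful, 4096, [1024, 1024])
print("careful value:", careful.value, "saturated:", careful.saturated)
```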
      
      [1]
      refcount_t: saturated; leaking memory.
      ------------[ cut here ]------------
      WARNING: CPU: 0 PID: 8404 at lib/refcount.c:77 refcount_add_not_zero+0x198/0x200 lib/refcount.c:77
      Kernel panic - not syncing: panic_on_warn set ...
      
      CPU: 0 PID: 8404 Comm: syz-executor2 Not tainted 4.14.0-rc5-mm1+ #20
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:16 [inline]
       dump_stack+0x194/0x257 lib/dump_stack.c:52
       panic+0x1e4/0x41c kernel/panic.c:183
       __warn+0x1c4/0x1e0 kernel/panic.c:546
       report_bug+0x211/0x2d0 lib/bug.c:183
       fixup_bug+0x40/0x90 arch/x86/kernel/traps.c:177
       do_trap_no_signal arch/x86/kernel/traps.c:211 [inline]
       do_trap+0x260/0x390 arch/x86/kernel/traps.c:260
       do_error_trap+0x120/0x390 arch/x86/kernel/traps.c:297
       do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:310
       invalid_op+0x18/0x20 arch/x86/entry/entry_64.S:905
      RIP: 0010:refcount_add_not_zero+0x198/0x200 lib/refcount.c:77
      RSP: 0018:ffff8801c606e3a0 EFLAGS: 00010282
      RAX: 0000000000000026 RBX: 0000000000001401 RCX: 0000000000000000
      RDX: 0000000000000026 RSI: ffffc900036fc000 RDI: ffffed0038c0dc68
      RBP: ffff8801c606e430 R08: 0000000000000001 R09: 0000000000000000
      R10: ffff8801d97f5eba R11: 0000000000000000 R12: ffff8801d5acf73c
      R13: 1ffff10038c0dc75 R14: 00000000ffffffff R15: 00000000fffff72f
       refcount_add+0x1b/0x60 lib/refcount.c:101
       tcp_gso_segment+0x10d0/0x16b0 net/ipv4/tcp_offload.c:155
       tcp4_gso_segment+0xd4/0x310 net/ipv4/tcp_offload.c:51
       inet_gso_segment+0x60c/0x11c0 net/ipv4/af_inet.c:1271
       skb_mac_gso_segment+0x33f/0x660 net/core/dev.c:2749
       __skb_gso_segment+0x35f/0x7f0 net/core/dev.c:2821
       skb_gso_segment include/linux/netdevice.h:3971 [inline]
       validate_xmit_skb+0x4ba/0xb20 net/core/dev.c:3074
       __dev_queue_xmit+0xe49/0x2070 net/core/dev.c:3497
       dev_queue_xmit+0x17/0x20 net/core/dev.c:3538
       neigh_hh_output include/net/neighbour.h:471 [inline]
       neigh_output include/net/neighbour.h:479 [inline]
       ip_finish_output2+0xece/0x1460 net/ipv4/ip_output.c:229
       ip_finish_output+0x85e/0xd10 net/ipv4/ip_output.c:317
       NF_HOOK_COND include/linux/netfilter.h:238 [inline]
       ip_output+0x1cc/0x860 net/ipv4/ip_output.c:405
       dst_output include/net/dst.h:459 [inline]
       ip_local_out+0x95/0x160 net/ipv4/ip_output.c:124
       ip_queue_xmit+0x8c6/0x18e0 net/ipv4/ip_output.c:504
       tcp_transmit_skb+0x1ab7/0x3840 net/ipv4/tcp_output.c:1137
       tcp_write_xmit+0x663/0x4de0 net/ipv4/tcp_output.c:2341
       __tcp_push_pending_frames+0xa0/0x250 net/ipv4/tcp_output.c:2513
       tcp_push_pending_frames include/net/tcp.h:1722 [inline]
       tcp_data_snd_check net/ipv4/tcp_input.c:5050 [inline]
       tcp_rcv_established+0x8c7/0x18a0 net/ipv4/tcp_input.c:5497
       tcp_v4_do_rcv+0x2ab/0x7d0 net/ipv4/tcp_ipv4.c:1460
       sk_backlog_rcv include/net/sock.h:909 [inline]
       __release_sock+0x124/0x360 net/core/sock.c:2264
       release_sock+0xa4/0x2a0 net/core/sock.c:2776
       tcp_sendmsg+0x3a/0x50 net/ipv4/tcp.c:1462
       inet_sendmsg+0x11f/0x5e0 net/ipv4/af_inet.c:763
       sock_sendmsg_nosec net/socket.c:632 [inline]
       sock_sendmsg+0xca/0x110 net/socket.c:642
       ___sys_sendmsg+0x31c/0x890 net/socket.c:2048
       __sys_sendmmsg+0x1e6/0x5f0 net/socket.c:2138
      
      Fixes: 14afee4b ("net: convert sock.sk_wmem_alloc from atomic_t to refcount_t")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Stephane Grosjean's avatar
      can: peak: Add support for new PCIe/M2 CAN FD interfaces · 4cbdd0ee
      Stephane Grosjean authored
      This adds support for the following PEAK-System CAN FD interfaces:
      
      PCAN-cPCIe FD         CAN FD Interface for cPCI Serial (2 or 4 channels)
      PCAN-PCIe/104-Express CAN FD Interface for PCIe/104-Express (1, 2 or 4 ch.)
      PCAN-miniPCIe FD      CAN FD Interface for PCIe Mini (1, 2 or 4 channels)
      PCAN-PCIe FD OEM      CAN FD Interface for PCIe OEM version (1, 2 or 4 ch.)
      PCAN-M.2              CAN FD Interface for M.2 (1 or 2 channels)
      
      Like the PCAN-PCIe FD interface, all of these boards run the same IP Core
      that is able to handle CAN FD (see also http://www.peak-system.com).
      Signed-off-by: Stephane Grosjean <s.grosjean@peak-system.com>
      Cc: linux-stable <stable@vger.kernel.org>
      Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
    • Gerhard Bertelsmann's avatar
      can: sun4i: handle overrun in RX FIFO · 4dcf924c
      Gerhard Bertelsmann authored
      SUN4I's CAN IP has a 64 byte deep FIFO buffer. If the buffer is not
      drained fast enough (overrun), its contents get mangled. Already received
      frames are dropped - the data can't be restored.
      Signed-off-by: Gerhard Bertelsmann <info@gerhard-bertelsmann.de>
      Cc: linux-stable <stable@vger.kernel.org>
      Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
    • Richard Schütz's avatar
      can: c_can: don't indicate triple sampling support for D_CAN · fb5f0b3e
      Richard Schütz authored
      The D_CAN controller doesn't provide a triple sampling mode, so don't set
      the CAN_CTRLMODE_3_SAMPLES flag in ctrlmode_supported. Currently enabling
      triple sampling is a no-op.
      Signed-off-by: Richard Schütz <rschuetz@uni-koblenz.de>
      Cc: linux-stable <stable@vger.kernel.org> # >= v3.6
      Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
    • Alexander Shishkin's avatar
      x86/debug: Handle warnings before the notifier chain, to fix KGDB crash · b8347c21
      Alexander Shishkin authored
      Commit:
      
        9a93848f ("x86/debug: Implement __WARN() using UD0")
      
      turned warnings into UD0, but the fixup code only runs after the
      notify_die() chain. This is a problem, in particular, with kgdb,
      which kicks in as if it was a BUG().
      
      Fix this by running the fixup code before the notifier chain in
      the invalid op handler path.
      Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Tested-by: Ilya Dryomov <idryomov@gmail.com>
      Acked-by: Daniel Thompson <daniel.thompson@linaro.org>
      Acked-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Jason Wessel <jason.wessel@windriver.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Richard Weinberger <richard.weinberger@gmail.com>
      Cc: <stable@vger.kernel.org> # v4.12+
      Link: http://lkml.kernel.org/r/20170724100428.19173-1-alexander.shishkin@linux.intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • Eugenia Emantayev's avatar
      net/mlx5e: Increase Striding RQ minimum size limit to 4 multi-packet WQEs · d1c61e6d
      Eugenia Emantayev authored
      This is to prevent the case of working with a single MPWQE
      (1 WQE is always reserved as RQ is linked-list).
      When the WQE is fully consumed, HW should still have available buffer
      in order not to drop packets.
      
      Fixes: 461017cb ("net/mlx5e: Support RX multi-packet WQE (Striding RQ)")
      Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
      Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
      Cc: kernel-team@fb.com
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • Inbar Karmy's avatar
      net/mlx5e: Set page to null in case dma mapping fails · 2e50b261
      Inbar Karmy authored
      Currently, when dma mapping fails, put_page is called,
      but the page is not set to null. Later, in the page_reuse treatment in
      mlx5e_free_rx_descs(), mlx5e_page_release() is called for the second time,
      improperly doing dma_unmap (for a non-mapped address) and an extra put_page.
      Prevent this by nullifying the page pointer when dma_map fails.
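
The double release described above can be reproduced with a toy model. This sketch is Python rather than driver C, and every name in it (`Page`, `RxSlot`, `alloc_rx_page`, `free_rx_slot`) is hypothetical. It only shows the general pattern: once a resource has been released on an error path, the stored pointer is cleared so that a later teardown pass skips it.

```python
class Page:
    """Toy page with a reference count."""

    def __init__(self):
        self.refs = 1


def put_page(page):
    if page.refs == 0:
        raise RuntimeError("extra put_page on an already released page")
    page.refs -= 1


class RxSlot:
    """Hypothetical RX descriptor slot that remembers its page."""

    def __init__(self):
        self.page = None


def alloc_rx_page(slot, dma_map_ok, clear_on_failure=True):
    slot.page = Page()
    if not dma_map_ok:
        put_page(slot.page)
        if clear_on_failure:
            slot.page = None   # forget the page that was just released
        return False
    return True


def free_rx_slot(slot):
    """Teardown path: release whatever page the slot still points at."""
    if slot.page is None:
        return False
    put_page(slot.page)
    slot.page = None
    return True


slot = RxSlot()
alloc_rx_page(slot, dma_map_ok=False)
print("released again at teardown:", free_rx_slot(slot))
```

With `clear_on_failure=False` the teardown call raises, which stands in for the extra put_page and the unmap of a never-mapped address.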
      
      Fixes: accd5883 ("net/mlx5e: Introduce RX Page-Reuse")
      Signed-off-by: Inbar Karmy <inbark@mellanox.com>
      Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
      Cc: kernel-team@fb.com
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • Saeed Mahameed's avatar
      net/mlx5e: Fix napi poll with zero budget · 2a8d6065
      Saeed Mahameed authored
      napi->poll can be called with budget 0, e.g. in netpoll scenarios
      where the caller only wants to poll TX rings
      (poll_one_napi@net/core/netpoll.c).
      
      The commit named in the Fixes tag below changed RX polling from a
      "while" loop to "do {} while", which caused the initial budget to be
      ignored and at least one RX packet to be handled.
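
The difference between the two loop shapes can be seen in a simplified model. This is a sketch under assumptions, not the driver code: the `poll_rx_*` helpers are hypothetical, `pending` stands for packets waiting on the RX ring, and the zero-budget guard is just one possible way to restore the old behaviour.

```c
#include <assert.h>

/* "while" form: the budget is checked before any packet is handled. */
static int poll_rx_while(int pending, int budget)
{
	int done = 0;

	while (done < budget && pending > 0) {
		pending--;
		done++;
	}
	return done;
}

/* "do {} while" form with an early return on an empty queue:
 * the budget is only checked after the first packet is handled. */
static int poll_rx_do_while(int pending, int budget)
{
	int done = 0;

	if (pending == 0)
		return 0;
	do {
		pending--;
		done++;
	} while (done < budget && pending > 0);
	return done;
}

/* One possible guard: skip RX polling entirely when budget is 0. */
static int poll_rx_guarded(int pending, int budget)
{
	if (budget <= 0)
		return 0;
	return poll_rx_do_while(pending, budget);
}

static void poll_demo(void)
{
	assert(poll_rx_while(3, 0) == 0);	/* respects budget 0 */
	assert(poll_rx_do_while(3, 0) == 1);	/* exceeds budget 0 */
	assert(poll_rx_guarded(3, 0) == 0);	/* guard restores it */
}
```

With a budget of 0 the "do {} while" form still reports one packet of work, which is the situation the "exceeded budget in poll" warning reports.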
      
      This fixes the following warning:
      [ 2852.049194] mlx5e_napi_poll+0x0/0x260 [mlx5_core] exceeded budget in poll
      [ 2852.049195] ------------[ cut here ]------------
      [ 2852.049195] WARNING: CPU: 0 PID: 25691 at net/core/netpoll.c:171 netpoll_poll_dev+0x18a/0x1a0
      
      Fixes: 4b7dfc99 ("net/mlx5e: Early-return on empty completion queues")
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
      Reported-by: Martin KaFai Lau <kafai@fb.com>
      Tested-by: Martin KaFai Lau <kafai@fb.com>
      Cc: kernel-team@fb.com
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • Huy Nguyen's avatar
      net/mlx5: Cancel health poll before sending panic teardown command · d2aa060d
      Huy Nguyen authored
      After the panic teardown firmware command, health_care detects the error
      on the PCI bus and calls mlx5_pci_err_detected(). This health_care flow is
      no longer needed because the panic teardown firmware command will bring
      down the PCI bus communication with the HCA.
      
      The solution is to cancel the health care timer and its pending
      workqueue request before sending the panic teardown firmware command.
      
      Kernel trace:
      mlx5_core 0033:01:00.0: Shutdown was called
      mlx5_core 0033:01:00.0: health_care:154:(pid 9304): handling bad device here
      mlx5_core 0033:01:00.0: mlx5_handle_bad_state:114:(pid 9304): NIC state 1
      mlx5_core 0033:01:00.0: mlx5_pci_err_detected was called
      mlx5_core 0033:01:00.0: mlx5_enter_error_state:96:(pid 9304): start
      mlx5_3:mlx5_ib_event:3061:(pid 9304): warning: event on port 0
      mlx5_core 0033:01:00.0: mlx5_enter_error_state:104:(pid 9304): end
      Unable to handle kernel paging request for data at address 0x0000003f
      Faulting instruction address: 0xc0080000434b8c80
      
      Fixes: 8812c24d ('net/mlx5: Add fast unload support in shutdown flow')
      Signed-off-by: Huy Nguyen <huyn@mellanox.com>
      Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • Huy Nguyen's avatar
      net/mlx5: Loop over temp list to release delay events · b8cce68b
      Huy Nguyen authored
      list_splice_init() reinitializes waiting_events_list after splicing it
      onto the temp list, so we should loop over the temp list to fire the events.
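
The splice-then-reinitialise behaviour can be reproduced with a tiny user-space list. This is a sketch that only mimics the semantics described above: `list_node`, `list_append()` and `list_splice_and_init()` are hypothetical helpers written for this example, not the kernel's list API.

```c
#include <assert.h>

/* Minimal circular doubly linked list, in the style of kernel lists. */
struct list_node { struct list_node *next, *prev; };

static void list_init(struct list_node *h) { h->next = h; h->prev = h; }
static int list_is_empty(const struct list_node *h) { return h->next == h; }

/* Add node n at the tail of the list headed by h. */
static void list_append(struct list_node *n, struct list_node *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

/* Move every entry of src to the front of dst, then reset src to empty. */
static void list_splice_and_init(struct list_node *src, struct list_node *dst)
{
	struct list_node *first, *last, *at;

	if (list_is_empty(src))
		return;
	first = src->next;
	last = src->prev;
	at = dst->next;
	first->prev = dst;
	dst->next = first;
	last->next = at;
	at->prev = last;
	list_init(src);
}

static int list_count(const struct list_node *h)
{
	const struct list_node *p;
	int n = 0;

	for (p = h->next; p != h; p = p->next)
		n++;
	return n;
}

/* Returns entries found on the temp list; stores what is left on the source. */
static int splice_demo(int *left_on_source)
{
	struct list_node waiting, temp, a, b;

	list_init(&waiting);
	list_init(&temp);
	list_append(&a, &waiting);
	list_append(&b, &waiting);
	list_splice_and_init(&waiting, &temp);
	*left_on_source = list_count(&waiting);
	return list_count(&temp);
}
```

After the splice, iterating the source list visits nothing, so any loop that fires the delayed events has to walk the temporary list.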
      
      Fixes: 4ca637a2 ("net/mlx5: Delay events till mlx5 interface's add complete for pci resume")
      Signed-off-by: Huy Nguyen <huyn@mellanox.com>
      Signed-off-by: Feras Daoud <ferasda@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • Håkon Bugge's avatar
      rds: ib: Fix NULL pointer dereference in debug code · 1cb483a5
      Håkon Bugge authored
      rds_ib_recv_refill() is a function that refills an IB receive
      queue. It can be called from both the CQE handler (tasklet) and a
      worker thread.
      
      Just after the call to ib_post_recv(), a debug message is printed with
      rdsdebug():
      
                  ret = ib_post_recv(ic->i_cm_id->qp, &recv->r_wr, &failed_wr);
                  rdsdebug("recv %p ibinc %p page %p addr %lu ret %d\n", recv,
                           recv->r_ibinc, sg_page(&recv->r_frag->f_sg),
                           (long) ib_sg_dma_address(
                                  ic->i_cm_id->device,
                                  &recv->r_frag->f_sg),
                          ret);
      
      Now consider an invocation of rds_ib_recv_refill() from the worker
      thread, which is preemptible. Further, assume that the worker thread
      is preempted between the ib_post_recv() and rdsdebug() statements.
      
      Then, if the preemption is due to a receive CQE event, the
      rds_ib_recv_cqe_handler() will be invoked. This function processes
      receive completions, including freeing up data structures, such as the
      recv->r_frag.
      
      In this scenario, rds_ib_recv_cqe_handler() will process the receive
      WR posted above. That implies that recv->r_frag has been freed
      before the above rdsdebug() statement has been executed. When it is
      later executed, we will have a NULL pointer dereference:
      
      [ 4088.068008] BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
      [ 4088.076754] IP: rds_ib_recv_refill+0x87/0x620 [rds_rdma]
      [ 4088.082686] PGD 0 P4D 0
      [ 4088.085515] Oops: 0000 [#1] SMP
      [ 4088.089015] Modules linked in: rds_rdma(OE) rds(OE) rpcsec_gss_krb5(E) nfsv4(E) dns_resolver(E) nfs(E) fscache(E) mlx4_ib(E) ib_ipoib(E) rdma_ucm(E) ib_ucm(E) ib_uverbs(E) ib_umad(E) rdma_cm(E) ib_cm(E) iw_cm(E) ib_core(E) binfmt_misc(E) sb_edac(E) intel_powerclamp(E) coretemp(E) kvm_intel(E) kvm(E) irqbypass(E) crct10dif_pclmul(E) crc32_pclmul(E) ghash_clmulni_intel(E) pcbc(E) aesni_intel(E) crypto_simd(E) iTCO_wdt(E) glue_helper(E) iTCO_vendor_support(E) sg(E) cryptd(E) pcspkr(E) ipmi_si(E) ipmi_devintf(E) ipmi_msghandler(E) shpchp(E) ioatdma(E) i2c_i801(E) wmi(E) lpc_ich(E) mei_me(E) mei(E) mfd_core(E) nfsd(E) auth_rpcgss(E) nfs_acl(E) lockd(E) grace(E) sunrpc(E) ip_tables(E) ext4(E) mbcache(E) jbd2(E) fscrypto(E) mgag200(E) i2c_algo_bit(E) drm_kms_helper(E) syscopyarea(E) sysfillrect(E) sysimgblt(E)
      [ 4088.168486]  fb_sys_fops(E) ahci(E) ixgbe(E) libahci(E) ttm(E) mdio(E) ptp(E) pps_core(E) drm(E) sd_mod(E) libata(E) crc32c_intel(E) mlx4_core(E) i2c_core(E) dca(E) megaraid_sas(E) dm_mirror(E) dm_region_hash(E) dm_log(E) dm_mod(E) [last unloaded: rds]
      [ 4088.193442] CPU: 20 PID: 1244 Comm: kworker/20:2 Tainted: G           OE   4.14.0-rc7.master.20171105.ol7.x86_64 #1
      [ 4088.205097] Hardware name: Oracle Corporation ORACLE SERVER X5-2L/ASM,MOBO TRAY,2U, BIOS 31110000 03/03/2017
      [ 4088.216074] Workqueue: ib_cm cm_work_handler [ib_cm]
      [ 4088.221614] task: ffff885fa11d0000 task.stack: ffffc9000e598000
      [ 4088.228224] RIP: 0010:rds_ib_recv_refill+0x87/0x620 [rds_rdma]
      [ 4088.234736] RSP: 0018:ffffc9000e59bb68 EFLAGS: 00010286
      [ 4088.240568] RAX: 0000000000000000 RBX: ffffc9002115d050 RCX: ffffc9002115d050
      [ 4088.248535] RDX: ffffffffa0521380 RSI: ffffffffa0522158 RDI: ffffffffa0525580
      [ 4088.256498] RBP: ffffc9000e59bbf8 R08: 0000000000000005 R09: 0000000000000000
      [ 4088.264465] R10: 0000000000000339 R11: 0000000000000001 R12: 0000000000000000
      [ 4088.272433] R13: ffff885f8c9d8000 R14: ffffffff81a0a060 R15: ffff884676268000
      [ 4088.280397] FS:  0000000000000000(0000) GS:ffff885fbec80000(0000) knlGS:0000000000000000
      [ 4088.289434] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [ 4088.295846] CR2: 0000000000000020 CR3: 0000000001e09005 CR4: 00000000001606e0
      [ 4088.303816] Call Trace:
      [ 4088.306557]  rds_ib_cm_connect_complete+0xe0/0x220 [rds_rdma]
      [ 4088.312982]  ? __dynamic_pr_debug+0x8c/0xb0
      [ 4088.317664]  ? __queue_work+0x142/0x3c0
      [ 4088.321944]  rds_rdma_cm_event_handler+0x19e/0x250 [rds_rdma]
      [ 4088.328370]  cma_ib_handler+0xcd/0x280 [rdma_cm]
      [ 4088.333522]  cm_process_work+0x25/0x120 [ib_cm]
      [ 4088.338580]  cm_work_handler+0xd6b/0x17aa [ib_cm]
      [ 4088.343832]  process_one_work+0x149/0x360
      [ 4088.348307]  worker_thread+0x4d/0x3e0
      [ 4088.352397]  kthread+0x109/0x140
      [ 4088.355996]  ? rescuer_thread+0x380/0x380
      [ 4088.360467]  ? kthread_park+0x60/0x60
      [ 4088.364563]  ret_from_fork+0x25/0x30
      [ 4088.368548] Code: 48 89 45 90 48 89 45 98 eb 4d 0f 1f 44 00 00 48 8b 43 08 48 89 d9 48 c7 c2 80 13 52 a0 48 c7 c6 58 21 52 a0 48 c7 c7 80 55 52 a0 <4c> 8b 48 20 44 89 64 24 08 48 8b 40 30 49 83 e1 fc 48 89 04 24
      [ 4088.389612] RIP: rds_ib_recv_refill+0x87/0x620 [rds_rdma] RSP: ffffc9000e59bb68
      [ 4088.397772] CR2: 0000000000000020
      [ 4088.401505] ---[ end trace fe922e6ccf004431 ]---
      
      This bug was provoked by compiling rds out-of-tree with
      EXTRA_CFLAGS="-DRDS_DEBUG -DDEBUG" and inserting an artificial delay
      between the ib_post_recv() and rdsdebug() statements:
      
         	       /* XXX when can this fail? */
      	       ret = ib_post_recv(ic->i_cm_id->qp, &recv->r_wr, &failed_wr);
      +		if (can_wait)
      +			usleep_range(1000, 5000);
      	       rdsdebug("recv %p ibinc %p page %p addr %lu ret %d\n", recv,
      			recv->r_ibinc, sg_page(&recv->r_frag->f_sg),
      			(long) ib_sg_dma_address(
      
      The fix is simply to move the rdsdebug() statement up before the
      ib_post_recv() and remove the printing of ret, which is taken care of
      anyway by the non-debug code.
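
The ordering issue can be modelled in a single thread by letting the post call run the completion immediately. This is a sketch under assumptions, not the rds code: `fake_recv`, `fake_post_recv()` and `fake_complete()` are hypothetical, and the immediate completion stands in for the preemption by the CQE handler.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the receive work request and its fragment. */
struct fake_frag { unsigned long addr; };
struct fake_recv { struct fake_frag *r_frag; };

/* Models the completion handler: it releases the fragment. */
static void fake_complete(struct fake_recv *recv)
{
	recv->r_frag = NULL;
}

/* Models posting the WR; the completion may run before this returns. */
static int fake_post_recv(struct fake_recv *recv, int completes_at_once)
{
	if (completes_at_once)
		fake_complete(recv);
	return 0;
}

/* Debug read placed after the post: the fragment may already be gone.
 * Returns 1 if the fragment was still valid when the debug code ran. */
static int refill_debug_after_post(struct fake_recv *recv)
{
	fake_post_recv(recv, 1);
	return recv->r_frag != NULL;
}

/* Debug read moved before the post: the caller still owns the fragment. */
static int refill_debug_before_post(struct fake_recv *recv)
{
	int frag_valid = recv->r_frag != NULL;

	fake_post_recv(recv, 1);
	return frag_valid;
}

static void refill_demo(void)
{
	struct fake_frag f1 = { 0x1000 }, f2 = { 0x2000 };
	struct fake_recv r1 = { &f1 }, r2 = { &f2 };

	assert(refill_debug_after_post(&r1) == 0);	/* would dereference NULL */
	assert(refill_debug_before_post(&r2) == 1);	/* safe to print */
}
```

Once the work request is posted, the completion path owns the fragment, so anything the debug message needs has to be read before the post.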
      
      Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com>
      Reviewed-by: Knut Omang <knut.omang@oracle.com>
      Reviewed-by: Wei Lin Guay <wei.lin.guay@oracle.com>
      Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Linus Torvalds's avatar
      Merge branch 'akpm' (patches from Andrew) · 1c9dbd46
      Linus Torvalds authored
      Merge misc fixes from Andrew Morton:
       "2 fixes"
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>:
        MAINTAINERS: update TPM driver infrastructure changes
        sysctl: add register_sysctl() dummy helper