1. 05 Sep, 2020 7 commits
    • mm: track page table modifications in __apply_to_page_range() · e80d3909
      Joerg Roedel authored
      __apply_to_page_range() is also used to change and/or allocate
      page-table pages in the vmalloc area of the address space.  Make sure
      these changes get synchronized to other page-tables in the system by
      calling arch_sync_kernel_mappings() when necessary.
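
A minimal user-space sketch of the idea described above, not the kernel code: the range walker records which page-table levels it touched in a bitmask and calls a sync hook only when a level the architecture cares about was modified. All `toy_`/`TOY_` names are hypothetical stand-ins for illustration.

```c
#include <assert.h>

/* Hypothetical flags, one per page-table level the walker may modify. */
#define TOY_PTE_MODIFIED (1u << 0)
#define TOY_PMD_MODIFIED (1u << 1)

/* Hypothetical: levels this toy architecture must propagate elsewhere. */
#define TOY_SYNC_MASK TOY_PMD_MODIFIED

/* Counts how often the sync hook ran, so the effect is observable. */
int toy_sync_calls;

/* Stand-in for the architecture hook that syncs kernel mappings. */
static void toy_sync_kernel_mappings(unsigned long start, unsigned long end)
{
	assert(start < end);
	toy_sync_calls++;
}

/*
 * Stand-in for the range walker: it always touches PTEs and, when
 * allocates_pmd is non-zero, also populates a PMD entry. The sync hook
 * runs only if a level in TOY_SYNC_MASK was modified.
 */
unsigned int toy_apply_range(unsigned long start, unsigned long end,
			     int allocates_pmd)
{
	unsigned int mask = TOY_PTE_MODIFIED;

	if (allocates_pmd)
		mask |= TOY_PMD_MODIFIED;

	if (mask & TOY_SYNC_MASK)
		toy_sync_kernel_mappings(start, end);

	return mask;
}
```

Calling `toy_apply_range()` with `allocates_pmd` set triggers the hook once; a PTE-only change leaves `toy_sync_calls` untouched.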
      
      The impact appears limited to x86-32, where apply_to_page_range may miss
      updating the PMD.  That leads to explosions in drivers like
      
        BUG: unable to handle page fault for address: fe036000
        #PF: supervisor write access in kernel mode
        #PF: error_code(0x0002) - not-present page
        *pde = 00000000
        Oops: 0002 [#1] SMP
        CPU: 3 PID: 1300 Comm: gem_concurrent_ Not tainted 5.9.0-rc1+ #16
        Hardware name:  /NUC6i3SYB, BIOS SYSKLi35.86A.0024.2015.1027.2142 10/27/2015
        EIP: __execlists_context_alloc+0x132/0x2d0 [i915]
        Code: 31 d2 89 f0 e8 2f 55 02 00 89 45 e8 3d 00 f0 ff ff 0f 87 11 01 00 00 8b 4d e8 03 4b 30 b8 5a 5a 5a 5a ba 01 00 00 00 8d 79 04 <c7> 01 5a 5a 5a 5a c7 81 fc 0f 00 00 5a 5a 5a 5a 83 e7 fc 29 f9 81
        EAX: 5a5a5a5a EBX: f60ca000 ECX: fe036000 EDX: 00000001
        ESI: f43b7340 EDI: fe036004 EBP: f6389cb8 ESP: f6389c9c
        DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00010286
        CR0: 80050033 CR2: fe036000 CR3: 2d361000 CR4: 001506d0
        DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
        DR6: fffe0ff0 DR7: 00000400
        Call Trace:
          execlists_context_alloc+0x10/0x20 [i915]
          intel_context_alloc_state+0x3f/0x70 [i915]
          __intel_context_do_pin+0x117/0x170 [i915]
          i915_gem_do_execbuffer+0xcc7/0x2500 [i915]
          i915_gem_execbuffer2_ioctl+0xcd/0x1f0 [i915]
          drm_ioctl_kernel+0x8f/0xd0
          drm_ioctl+0x223/0x3d0
          __ia32_sys_ioctl+0x1ab/0x760
          __do_fast_syscall_32+0x3f/0x70
          do_fast_syscall_32+0x29/0x60
          do_SYSENTER_32+0x15/0x20
          entry_SYSENTER_32+0x9f/0xf2
        EIP: 0xb7f28559
        Code: 03 74 c0 01 10 05 03 74 b8 01 10 06 03 74 b4 01 10 07 03 74 b0 01 10 08 03 74 d8 01 00 00 00 00 00 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 8d 76 00 58 b8 77 00 00 00 cd 80 90 8d 76
        EAX: ffffffda EBX: 00000005 ECX: c0406469 EDX: bf95556c
        ESI: b7e68000 EDI: c0406469 EBP: 00000005 ESP: bf9554d8
        DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00000296
        Modules linked in: i915 x86_pkg_temp_thermal intel_powerclamp crc32_pclmul crc32c_intel intel_cstate intel_uncore intel_gtt drm_kms_helper intel_pch_thermal video button autofs4 i2c_i801 i2c_smbus fan
        CR2: 00000000fe036000
      
      It looks like kasan, xen and i915 are vulnerable.
      
      Actual impact is "on thinkpad X60 in 5.9-rc1, screen starts blinking
      after 30-or-so minutes, and machine is unusable"
      
      [sfr@canb.auug.org.au: ARCH_PAGE_TABLE_SYNC_MASK needs vmalloc.h]
        Link: https://lkml.kernel.org/r/20200825172508.16800a4f@canb.auug.org.au
      [chris@chris-wilson.co.uk: changelog addition]
      [pavel@ucw.cz: changelog addition]
      
      Fixes: 2ba3e694 ("mm/vmalloc: track which page-table levels were modified")
      Fixes: 86cf69f1 ("x86/mm/32: implement arch_sync_kernel_mappings()")
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
      Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Tested-by: Chris Wilson <chris@chris-wilson.co.uk>	[x86-32]
      Tested-by: Pavel Machek <pavel@ucw.cz>
      Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: <stable@vger.kernel.org>	[5.8+]
      Link: https://lkml.kernel.org/r/20200821123746.16904-1-joro@8bytes.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • MAINTAINERS: IA64: mark Status as Odd Fixes only · 9d90dd18
      Randy Dunlap authored
      IA64 isn't really being maintained, so mark it as Odd Fixes only.
      Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Acked-by: Tony Luck <tony.luck@intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Link: http://lkml.kernel.org/r/7e719139-450f-52c2-59a2-7964a34eda1f@infradead.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • MAINTAINERS: add LLVM maintainers · b9644289
      Nick Desaulniers authored
      Nominate Nathan and myself to be point of contact for clang/LLVM related
      support, after a poll at the LLVM BoF at Linux Plumbers Conf 2020.
      
      While corporate sponsorship is beneficial, it's important to not entrust
      the keys to the nukes with any one entity.  Should Nathan and I find
      ourselves at the same employer, I would gladly step down.
      Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Reviewed-by: Sedat Dilek <sedat.dilek@gmail.com>
      Acked-by: Nathan Chancellor <natechancellor@gmail.com>
      Acked-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
      Acked-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
      Acked-by: Masahiro Yamada <masahiroy@kernel.org>
      Link: https://lkml.kernel.org/r/20200825143540.2948637-1-ndesaulniers@google.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • MAINTAINERS: update Cavium/Marvell entries · f548a645
      Robert Richter authored
      I am leaving Marvell and already do not have access to my @marvell.com
      email address.  So I am switching over to my korg mail address, or
      removing my address where another maintainer is already listed.  For
      the entries where no other maintainer is listed, I will keep looking
      into patches for Cavium systems for a while until someone from Marvell
      takes over.
      
      Since I might have limited access to hardware and also limited time I
      changed state to 'Odd Fixes' for those entries.
      Signed-off-by: Robert Richter <rric@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Ganapatrao Kulkarni <gkulkarni@marvell.com>
      Cc: Sunil Goutham <sgoutham@marvell.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Marc Zyngier <maz@kernel.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Wolfram Sang <wsa@kernel.org>
      Link: https://lkml.kernel.org/r/20200824122050.31164-1-rric@kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: slub: fix conversion of freelist_corrupted() · dc07a728
      Eugeniu Rosca authored
      Commit 52f23478 ("mm/slub.c: fix corrupted freechain in
      deactivate_slab()") suffered an update when picked up from LKML [1].
      
      Specifically, relocating 'freelist = NULL' into 'freelist_corrupted()'
      created a no-op statement.  Fix it by sticking to the behavior intended
      in the original patch [1].  In addition, make freelist_corrupted()
      immune to passing NULL instead of &freelist.
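
The no-op is a plain C pitfall, and a small stand-alone sketch can show it. This is not the slub code: the function names below are hypothetical, and the only point is that assigning to a by-value pointer parameter never reaches the caller, whereas writing through a pointer-to-pointer does (and can tolerate NULL).

```c
#include <assert.h>
#include <stddef.h>

/* By value: only the local copy is cleared, so this is a no-op for the caller. */
static void reset_by_value(void *freelist)
{
	freelist = NULL;
	(void)freelist;
}

/* By pointer: clears the caller's variable, and tolerates a NULL argument. */
static void reset_by_pointer(void **freelist)
{
	if (freelist)
		*freelist = NULL;
}

/* Returns 1 if the caller's pointer is still set after reset_by_value(). */
int still_set_after_by_value(void)
{
	int object = 0;
	void *freelist = &object;

	assert(freelist != NULL);
	reset_by_value(freelist);
	return freelist != NULL;
}

/* Returns 1 if the caller's pointer is still set after reset_by_pointer(). */
int still_set_after_by_pointer(void)
{
	int object = 0;
	void *freelist = &object;

	reset_by_pointer(NULL);		/* must not crash */
	reset_by_pointer(&freelist);
	return freelist != NULL;
}
```

Here `still_set_after_by_value()` reports that the caller's pointer survived, which is the no-op the commit message describes in general terms.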
      
      The issue has been spotted via static analysis and code review.
      
      [1] https://lore.kernel.org/linux-mm/20200331031450.12182-1-dongli.zhang@oracle.com/
      
      Fixes: 52f23478 ("mm/slub.c: fix corrupted freechain in deactivate_slab()")
      Signed-off-by: Eugeniu Rosca <erosca@de.adit-jv.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Dongli Zhang <dongli.zhang@oracle.com>
      Cc: Joe Jin <joe.jin@oracle.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: <stable@vger.kernel.org>
      Link: https://lkml.kernel.org/r/20200824130643.10291-1-erosca@de.adit-jv.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: memcg: fix memcg reclaim soft lockup · e3336cab
      Xunlei Pang authored
      We've met softlockup with "CONFIG_PREEMPT_NONE=y", when the target memcg
      doesn't have any reclaimable memory.
      
      It can be easily reproduced as below:
      
        watchdog: BUG: soft lockup - CPU#0 stuck for 111s![memcg_test:2204]
        CPU: 0 PID: 2204 Comm: memcg_test Not tainted 5.9.0-rc2+ #12
        Call Trace:
          shrink_lruvec+0x49f/0x640
          shrink_node+0x2a6/0x6f0
          do_try_to_free_pages+0xe9/0x3e0
          try_to_free_mem_cgroup_pages+0xef/0x1f0
          try_charge+0x2c1/0x750
          mem_cgroup_charge+0xd7/0x240
          __add_to_page_cache_locked+0x2fd/0x370
          add_to_page_cache_lru+0x4a/0xc0
          pagecache_get_page+0x10b/0x2f0
          filemap_fault+0x661/0xad0
          ext4_filemap_fault+0x2c/0x40
          __do_fault+0x4d/0xf9
          handle_mm_fault+0x1080/0x1790
      
      It only happens on our 1-vcpu instances, because there's no chance for
      oom reaper to run to reclaim the to-be-killed process.
      
      Add a cond_resched() at the upper shrink_node_memcgs() to solve this
      issue.  This gives us a scheduling point for each memcg in the
      reclaimed hierarchy, independent of how much reclaimable memory that
      memcg has, which makes the behavior more predictable.
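
A toy model of why the placement matters, not the reclaim code itself. It assumes, as the message implies, that the inner reclaim work only reaches a scheduling point when a memcg actually has something to reclaim; `toy_walk()` and its parameters are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Walk n memcgs and count the scheduling points reached.
 * reclaimable[i] is the amount of reclaimable memory in memcg i.
 * With yield_per_memcg set, the outer loop yields once per memcg
 * regardless of how much there is to reclaim.
 */
int toy_walk(const long *reclaimable, size_t n, int yield_per_memcg)
{
	int points = 0;

	assert(reclaimable != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (yield_per_memcg)
			points++;	/* unconditional scheduling point */

		/* Assumption: inner reclaim yields only when it has work. */
		if (reclaimable[i] > 0)
			points++;
	}
	return points;
}
```

With an all-empty hierarchy the old behavior reaches no scheduling point at all, while the per-memcg yield gives one per iteration.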
      Suggested-by: Michal Hocko <mhocko@suse.com>
      Signed-off-by: Xunlei Pang <xlpang@linux.alibaba.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Acked-by: Chris Down <chris@chrisdown.name>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      Link: http://lkml.kernel.org/r/1598495549-67324-1-git-send-email-xlpang@linux.alibaba.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • memcg: fix use-after-free in uncharge_batch · f1796544
      Michal Hocko authored
      syzbot has reported a use-after-free in the uncharge_batch path:
      
        BUG: KASAN: use-after-free in instrument_atomic_write include/linux/instrumented.h:71 [inline]
        BUG: KASAN: use-after-free in atomic64_sub_return include/asm-generic/atomic-instrumented.h:970 [inline]
        BUG: KASAN: use-after-free in atomic_long_sub_return include/asm-generic/atomic-long.h:113 [inline]
        BUG: KASAN: use-after-free in page_counter_cancel mm/page_counter.c:54 [inline]
        BUG: KASAN: use-after-free in page_counter_uncharge+0x3d/0xc0 mm/page_counter.c:155
        Write of size 8 at addr ffff8880371c0148 by task syz-executor.0/9304
      
        CPU: 0 PID: 9304 Comm: syz-executor.0 Not tainted 5.8.0-syzkaller #0
        Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
        Call Trace:
          __dump_stack lib/dump_stack.c:77 [inline]
          dump_stack+0x1f0/0x31e lib/dump_stack.c:118
          print_address_description+0x66/0x620 mm/kasan/report.c:383
          __kasan_report mm/kasan/report.c:513 [inline]
          kasan_report+0x132/0x1d0 mm/kasan/report.c:530
          check_memory_region_inline mm/kasan/generic.c:183 [inline]
          check_memory_region+0x2b5/0x2f0 mm/kasan/generic.c:192
          instrument_atomic_write include/linux/instrumented.h:71 [inline]
          atomic64_sub_return include/asm-generic/atomic-instrumented.h:970 [inline]
          atomic_long_sub_return include/asm-generic/atomic-long.h:113 [inline]
          page_counter_cancel mm/page_counter.c:54 [inline]
          page_counter_uncharge+0x3d/0xc0 mm/page_counter.c:155
          uncharge_batch+0x6c/0x350 mm/memcontrol.c:6764
          uncharge_page+0x115/0x430 mm/memcontrol.c:6796
          uncharge_list mm/memcontrol.c:6835 [inline]
          mem_cgroup_uncharge_list+0x70/0xe0 mm/memcontrol.c:6877
          release_pages+0x13a2/0x1550 mm/swap.c:911
          tlb_batch_pages_flush mm/mmu_gather.c:49 [inline]
          tlb_flush_mmu_free mm/mmu_gather.c:242 [inline]
          tlb_flush_mmu+0x780/0x910 mm/mmu_gather.c:249
          tlb_finish_mmu+0xcb/0x200 mm/mmu_gather.c:328
          exit_mmap+0x296/0x550 mm/mmap.c:3185
          __mmput+0x113/0x370 kernel/fork.c:1076
          exit_mm+0x4cd/0x550 kernel/exit.c:483
          do_exit+0x576/0x1f20 kernel/exit.c:793
          do_group_exit+0x161/0x2d0 kernel/exit.c:903
          get_signal+0x139b/0x1d30 kernel/signal.c:2743
          arch_do_signal+0x33/0x610 arch/x86/kernel/signal.c:811
          exit_to_user_mode_loop kernel/entry/common.c:135 [inline]
          exit_to_user_mode_prepare+0x8d/0x1b0 kernel/entry/common.c:166
          syscall_exit_to_user_mode+0x5e/0x1a0 kernel/entry/common.c:241
          entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Commit 1a3e1f40 ("mm: memcontrol: decouple reference counting from
      page accounting") reworked the memcg lifetime to be bound to the struct
      page rather than charges.  It also removed the css_put_many from
      uncharge_batch and that is causing the above splat.
      
      uncharge_batch() is supposed to uncharge accumulated charges for all
      pages freed from the same memcg.  The queuing is done by uncharge_page
      which however drops the memcg reference after it adds charges to the
      batch.  If the current page happens to be the last one holding the
      reference for its memcg then the memcg is OK to go and the next page to
      be freed will trigger batched uncharge which needs to access the memcg
      which is gone already.
      
      Fix the issue by taking a reference for the memcg in the current batch.
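
The lifetime problem can be modelled in a few lines of user-space C. This is a sketch under stated assumptions, not the memcontrol code: `toy_memcg`, `toy_batch` and the `hold_ref` switch are hypothetical, and "freed" is just a flag so the stale access can be detected instead of actually happening.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_memcg { int refs; bool freed; long charged; };
struct toy_batch { struct toy_memcg *memcg; long nr_pages; };

static void toy_get(struct toy_memcg *m)
{
	assert(!m->freed);
	m->refs++;
}

static void toy_put(struct toy_memcg *m)
{
	assert(!m->freed && m->refs > 0);
	if (--m->refs == 0)
		m->freed = true;	/* last reference gone */
}

/* Queue one page: account it in the batch, then drop the page's reference. */
static void toy_queue(struct toy_batch *b, struct toy_memcg *m, bool hold_ref)
{
	if (b->memcg != m) {
		b->memcg = m;
		if (hold_ref)
			toy_get(m);	/* the batch keeps the memcg alive */
	}
	b->nr_pages++;
	toy_put(m);
}

/* Flush the batch; returns false if the memcg was already gone. */
static bool toy_flush(struct toy_batch *b, bool hold_ref)
{
	bool alive = !b->memcg->freed;

	if (alive) {
		b->memcg->charged -= b->nr_pages;
		if (hold_ref)
			toy_put(b->memcg);
	}
	b->memcg = NULL;
	b->nr_pages = 0;
	return alive;
}

/* The page being freed holds the last reference to its memcg. */
bool batch_sees_live_memcg(bool hold_ref)
{
	struct toy_memcg m = { .refs = 1, .freed = false, .charged = 1 };
	struct toy_batch b = { NULL, 0 };

	toy_queue(&b, &m, hold_ref);
	return toy_flush(&b, hold_ref);
}
```

Without the batch reference, the flush finds the memcg already released; with it, the memcg lives until the batch is done with it.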
      
      Fixes: 1a3e1f40 ("mm: memcontrol: decouple reference counting from page accounting")
      Reported-by: syzbot+b305848212deec86eabe@syzkaller.appspotmail.com
      Reported-by: syzbot+b5ea6fb6f139c8b9482b@syzkaller.appspotmail.com
      Signed-off-by: Michal Hocko <mhocko@suse.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Reviewed-by: Shakeel Butt <shakeelb@google.com>
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Roman Gushchin <guro@fb.com>
      Cc: Hugh Dickins <hughd@google.com>
      Link: https://lkml.kernel.org/r/20200820090341.GC5033@dhcp22.suse.cz
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  2. 04 Sep, 2020 5 commits
    • Merge tag 'perf-tools-fixes-for-v5.9-2020-09-03' of... · 59126901
      Linus Torvalds authored
      Merge tag 'perf-tools-fixes-for-v5.9-2020-09-03' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
      
      Pull more perf tools fixes from Arnaldo Carvalho de Melo:
      
       - Use uintptr_t when casting numbers to pointers
      
       - Keep output expected by 3rd parties: Turn off summary for interval
         mode by default.
      
       - BPF is in kernel space, make sure do_validate_kcore_modules() knows
         about that.
      
       - Explicitly call out event modifiers in the documentation.
      
       - Fix jevents() allocation of space for regular expressions.
      
       - Address libtraceevent build warnings on 32-bit arches.
      
       - Fix checking of function returns that use ERR_PTR() in 'perf bench'.
      
      * tag 'perf-tools-fixes-for-v5.9-2020-09-03' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
        perf tools: Add bpf image check to __map__is_kmodule
        perf record/stat: Explicitly call out event modifiers in the documentation
        perf bench: The do_run_multi_threaded() function must use IS_ERR(perf_session__new())
        perf stat: Turn off summary for interval mode by default
        libtraceevent: Fix build warning on 32-bit arches
        perf jevents: Fix suspicious code in fixregex()
        perf parse-events: Use uintptr_t when casting numbers to pointers
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 3e8d3bdc
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Use netif_rx_ni() when necessary in batman-adv stack, from Jussi
          Kivilinna.
      
       2) Fix loss of RTT samples in rxrpc, from David Howells.
      
       3) Memory leak in hns_nic_dev_probe(), from Dinghao Liu.
      
       4) ravb module cannot be unloaded, fix from Yuusuke Ashizuka.
      
       5) We disable BH for too long in sctp_get_port_local(), add a
          cond_resched() here as well, from Xin Long.
      
       6) Fix memory leak in st95hf_in_send_cmd, from Dinghao Liu.
      
       7) Out of bound access in bpf_raw_tp_link_fill_link_info(), from
          Yonghong Song.
      
       8) Missing of_node_put() in mt7530 DSA driver, from Sumera
          Priyadarsini.
      
       9) Fix crash in bnxt_fw_reset_task(), from Michael Chan.
      
      10) Fix geneve tunnel checksumming bug in hns3, from Yi Li.
      
      11) Memory leak in rxkad_verify_response, from Dinghao Liu.
      
      12) In tipc, don't use smp_processor_id() in preemptible context. From
          Tuong Lien.
      
      13) Fix signedness issue in mlx4 memory allocation, from Shung-Hsi Yu.
      
      14) Missing clk_disable_unprepare() in gemini driver, from Dan Carpenter.
      
      15) Fix ABI mismatch between driver and firmware in nfp, from Louis
          Peens.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (110 commits)
        net/smc: fix sock refcounting in case of termination
        net/smc: reset sndbuf_desc if freed
        net/smc: set rx_off for SMCR explicitly
        net/smc: fix toleration of fake add_link messages
        tg3: Fix soft lockup when tg3_reset_task() fails.
        doc: net: dsa: Fix typo in config code sample
        net: dp83867: Fix WoL SecureOn password
        nfp: flower: fix ABI mismatch between driver and firmware
        tipc: fix shutdown() of connectionless socket
        ipv6: Fix sysctl max for fib_multipath_hash_policy
        drivers/net/wan/hdlc: Change the default of hard_header_len to 0
        net: gemini: Fix another missing clk_disable_unprepare() in probe
        net: bcmgenet: fix mask check in bcmgenet_validate_flow()
        amd-xgbe: Add support for new port mode
        net: usb: dm9601: Add USB ID of Keenetic Plus DSL
        vhost: fix typo in error message
        net: ethernet: mlx4: Fix memory allocation in mlx4_buddy_init()
        pktgen: fix error message with wrong function name
        net: ethernet: ti: am65-cpsw: fix rmii 100Mbit link mode
        cxgb4: fix thermal zone device registration
        ...
    • Merge branch 'gate-page-refcount' (patches from Dave Hansen) · 8381979d
      Linus Torvalds authored
      Merge gate page refcount fix from Dave Hansen:
       "During the conversion over to pin_user_pages(), gate pages were missed.
      
        The fix is pretty simple, and is accompanied by a new test from Andy
        which probably would have caught this earlier"
      
      * emailed patches from Dave Hansen <dave.hansen@linux.intel.com>:
        selftests/x86/test_vsyscall: Improve the process_vm_readv() test
        mm: fix pin vs. gup mismatch with gate pages
    • selftests/x86/test_vsyscall: Improve the process_vm_readv() test · 8891adc6
      Andy Lutomirski authored
      The existing code accepted process_vm_readv() success or failure as long
      as it didn't return garbage.  This is too weak: if the vsyscall page is
      readable, then process_vm_readv() should succeed and, if the page is not
      readable, then it should fail.
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: x86@kernel.org
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Jann Horn <jannh@google.com>
      Cc: John Hubbard <jhubbard@nvidia.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: fix pin vs. gup mismatch with gate pages · 9fa2dd94
      Dave Hansen authored
      Gate pages were missed when converting from get to pin_user_pages().
      This can lead to refcount imbalances.  This is reliably and quickly
      reproducible running the x86 selftests when vsyscall=emulate is enabled
      (the default).  Fix by using try_grab_page() with appropriate flags
      passed.
      
      The long story:
      
      Today, pin_user_pages() and get_user_pages() are similar interfaces for
      manipulating page reference counts.  However, "pins" use a "bias" value
      and manipulate the actual reference count by 1024 instead of 1 used by
      plain "gets".
      
      That means that pin_user_pages() must be matched with unpin_user_pages()
      and can't be mixed with a plain put_user_pages() or put_page().
      
      Enter gate pages, like the vsyscall page.  They are pages usually in the
      kernel image, but which are mapped to userspace.  Userspace is allowed
      access to them, including interfaces using get/pin_user_pages().  The
      refcount of these kernel pages is manipulated just like a normal user
      page on the get/pin side so that the put/unpin side can work the same
      for normal user pages or gate pages.
      
      get_gate_page() uses try_get_page() which only bumps the refcount by
      1, not 1024, even if called in the pin_user_pages() path.  If someone
      pins a gate page, this happens:
      
      	pin_user_pages()
      		get_gate_page()
      			try_get_page() // bump refcount +1
      	... some time later
      	unpin_user_pages()
      		page_ref_sub_and_test(page, 1024)
      
      ... and boom, we get a refcount off by 1023.  This is reliably and
      quickly reproducible running the x86 selftests when booted with
      vsyscall=emulate (the default).  The selftests use ptrace(), but I
      suspect anything using pin_user_pages() on gate pages could hit this.
      
      To fix it, simply use try_grab_page() instead of try_get_page(), and
      pass 'gup_flags' in so that FOLL_PIN can be respected.
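
The arithmetic behind the "off by 1023" can be checked with a small stand-alone model. This is only a sketch: `toy_page`, `TOY_PIN_BIAS` and the helper names are hypothetical, and the value 1024 is taken from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Bias used for pins, as quoted in the commit message. */
#define TOY_PIN_BIAS 1024

struct toy_page { long refcount; };

/* Plain "get": always one reference, whatever the caller intended. */
static void toy_get(struct toy_page *p)
{
	assert(p != NULL);
	p->refcount += 1;
}

/* "Grab": respects the caller's intent, bias for pins and 1 for gets. */
static void toy_grab(struct toy_page *p, bool pin)
{
	assert(p != NULL);
	p->refcount += pin ? TOY_PIN_BIAS : 1;
}

/* "Unpin": always removes the full bias. */
static void toy_unpin(struct toy_page *p)
{
	assert(p != NULL);
	p->refcount -= TOY_PIN_BIAS;
}

/* Pin request served by a plain get, then unpinned: returns the drift. */
long drift_with_plain_get(void)
{
	struct toy_page page = { .refcount = 1 };

	toy_get(&page);
	toy_unpin(&page);
	return page.refcount - 1;
}

/* Pin request served by the intent-aware grab, then unpinned. */
long drift_with_grab(void)
{
	struct toy_page page = { .refcount = 1 };

	toy_grab(&page, true);
	toy_unpin(&page);
	return page.refcount - 1;
}
```

`drift_with_plain_get()` comes out at -1023, matching the imbalance described above, while `drift_with_grab()` returns to the starting count.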
      
      This bug traces back to the very beginning of the FOLL_PIN support in
      commit 3faa52c0 ("mm/gup: track FOLL_PIN pages"), which showed up in
      the 5.7 release.
      Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
      Fixes: 3faa52c0 ("mm/gup: track FOLL_PIN pages")
      Reported-by: Peter Zijlstra <peterz@infradead.org>
      Reviewed-by: John Hubbard <jhubbard@nvidia.com>
      Acked-by: Andy Lutomirski <luto@kernel.org>
      Cc: x86@kernel.org
      Cc: Jann Horn <jannh@google.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  3. 03 Sep, 2020 17 commits
    • Merge branch 'smc-fixes' · b61ac5bb
      David S. Miller authored
      Karsten Graul says:
      
      ====================
      net/smc: fixes 2020-09-03
      
      Please apply the following patch series for smc to netdev's net tree.
      
      Patch 1 fixes the toleration of older SMC implementations. Patch 2
      takes care of a problem that happens when SMCR is used after SMCD
      initialization failed. Patch 3 fixes a problem with freed send buffers,
      and patch 4 corrects refcounting when SMC terminates due to device
      removal.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/smc: fix sock refcounting in case of termination · 5fb8642a
      Ursula Braun authored
      When an ISM device is removed, all its linkgroups are terminated,
      i.e. all the corresponding connections are killed.
      Connection killing invokes smc_close_active_abort(), which decreases
      the sock refcount for certain states to simulate passive closing.
      And it cancels the close worker and has to give up the sock lock for
      this timeframe. This opens the door for a passive close worker or a
      socket close to run in between. In this case smc_close_active_abort() and
      the passive close worker or smc_release(), respectively, might each do a
      sock_put for passive closing. This causes:
      
      [ 1323.315943] refcount_t: underflow; use-after-free.
      [ 1323.316055] WARNING: CPU: 3 PID: 54469 at lib/refcount.c:28 refcount_warn_saturate+0xe8/0x130
      [ 1323.316069] Kernel panic - not syncing: panic_on_warn set ...
      [ 1323.316084] CPU: 3 PID: 54469 Comm: uperf Not tainted 5.9.0-20200826.rc2.git0.46328853ed20.300.fc32.s390x+debug #1
      [ 1323.316096] Hardware name: IBM 2964 NC9 702 (z/VM 6.4.0)
      [ 1323.316108] Call Trace:
      [ 1323.316125]  [<00000000c0d4aae8>] show_stack+0x90/0xf8
      [ 1323.316143]  [<00000000c15989b0>] dump_stack+0xa8/0xe8
      [ 1323.316158]  [<00000000c0d8344e>] panic+0x11e/0x288
      [ 1323.316173]  [<00000000c0d83144>] __warn+0xac/0x158
      [ 1323.316187]  [<00000000c1597a7a>] report_bug+0xb2/0x130
      [ 1323.316201]  [<00000000c0d36424>] monitor_event_exception+0x44/0xc0
      [ 1323.316219]  [<00000000c195c716>] pgm_check_handler+0x1da/0x238
      [ 1323.316234]  [<00000000c151844c>] refcount_warn_saturate+0xec/0x130
      [ 1323.316280] ([<00000000c1518448>] refcount_warn_saturate+0xe8/0x130)
      [ 1323.316310]  [<000003ff801f2e2a>] smc_release+0x192/0x1c8 [smc]
      [ 1323.316323]  [<00000000c169f1fa>] __sock_release+0x5a/0xe0
      [ 1323.316334]  [<00000000c169f2ac>] sock_close+0x2c/0x40
      [ 1323.316350]  [<00000000c1086de0>] __fput+0xb8/0x278
      [ 1323.316362]  [<00000000c0db1e0e>] task_work_run+0x76/0xb8
      [ 1323.316393]  [<00000000c0d8ab84>] do_exit+0x26c/0x520
      [ 1323.316408]  [<00000000c0d8af08>] do_group_exit+0x48/0xc0
      [ 1323.316421]  [<00000000c0d8afa8>] __s390x_sys_exit_group+0x28/0x38
      [ 1323.316433]  [<00000000c195c32c>] system_call+0xe0/0x2b4
      [ 1323.316446] 1 lock held by uperf/54469:
      [ 1323.316456]  #0: 0000000044125e60 (&sb->s_type->i_mutex_key#9){+.+.}-{3:3}, at: __sock_release+0x44/0xe0
      
      The patch rechecks sock state in smc_close_active_abort() after
      smc_close_cancel_work() to avoid duplicate decrease of sock
      refcount for the same purpose.
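
A simplified single-threaded model of the recheck, not the SMC code: the "racer" stands for whatever runs while the lock is given up, and all names, states and the starting count of two references are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_state { TOY_ACTIVE, TOY_CLOSED };

struct toy_sock { int refs; enum toy_state state; };

/*
 * Stand-in for cancelling the worker: the lock is dropped here, so a
 * concurrent closer may run and do its own put for passive closing.
 */
static void toy_cancel_work(struct toy_sock *sk, bool racer_runs)
{
	assert(sk != NULL);
	if (racer_runs && sk->state == TOY_ACTIVE) {
		sk->state = TOY_CLOSED;
		sk->refs--;
	}
}

/* Abort path: decide on the put before (buggy) or after (fixed) the window. */
static void toy_abort(struct toy_sock *sk, bool racer_runs, bool recheck)
{
	enum toy_state before = sk->state;

	toy_cancel_work(sk, racer_runs);

	if ((recheck ? sk->state : before) == TOY_ACTIVE) {
		sk->state = TOY_CLOSED;
		sk->refs--;
	}
}

/* Returns the remaining references; 1 is the expected value. */
int refs_after_abort(bool racer_runs, bool recheck)
{
	struct toy_sock sk = { .refs = 2, .state = TOY_ACTIVE };

	toy_abort(&sk, racer_runs, recheck);
	return sk.refs;
}
```

Only the combination "racer runs, no recheck" drops the reference twice; rechecking the state after the window keeps the count at one.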
      
      Fixes: 611b63a1 ("net/smc: cancel tx worker in case of socket aborts")
      Reviewed-by: Karsten Graul <kgraul@linux.ibm.com>
      Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
      Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/smc: reset sndbuf_desc if freed · 1d8df41d
      Ursula Braun authored
      When an SMC connection is created and creating an RMB or DMB fails,
      the previously created send buffer is thrown away as well, including
      freeing its buffer descriptor.
      Make sure the connection no longer references the freed
      buffer descriptor, otherwise bugs like this are possible:
      
      [71556.835148] =============================================================================
      [71556.835168] BUG kmalloc-128 (Tainted: G    B      OE    ): Poison overwritten
      [71556.835172] -----------------------------------------------------------------------------
      
      [71556.835179] INFO: 0x00000000d20894be-0x00000000aaef63e9 @offset=2724. First byte 0x0 instead of 0x6b
      [71556.835215] INFO: Allocated in __smc_buf_create+0x184/0x578 [smc] age=0 cpu=5 pid=46726
      [71556.835234]     ___slab_alloc+0x5a4/0x690
      [71556.835239]     __slab_alloc.constprop.0+0x70/0xb0
      [71556.835243]     kmem_cache_alloc_trace+0x38e/0x3f8
      [71556.835250]     __smc_buf_create+0x184/0x578 [smc]
      [71556.835257]     smc_buf_create+0x2e/0xe8 [smc]
      [71556.835264]     smc_listen_work+0x516/0x6a0 [smc]
      [71556.835275]     process_one_work+0x280/0x478
      [71556.835280]     worker_thread+0x66/0x368
      [71556.835287]     kthread+0x17a/0x1a0
      [71556.835294]     ret_from_fork+0x28/0x2c
      [71556.835301] INFO: Freed in smc_buf_create+0xd8/0xe8 [smc] age=0 cpu=5 pid=46726
      [71556.835307]     __slab_free+0x246/0x560
      [71556.835311]     kfree+0x398/0x3f8
      [71556.835318]     smc_buf_create+0xd8/0xe8 [smc]
      [71556.835324]     smc_listen_work+0x516/0x6a0 [smc]
      [71556.835328]     process_one_work+0x280/0x478
      [71556.835332]     worker_thread+0x66/0x368
      [71556.835337]     kthread+0x17a/0x1a0
      [71556.835344]     ret_from_fork+0x28/0x2c
      [71556.835348] INFO: Slab 0x00000000a0744551 objects=51 used=51 fp=0x0000000000000000 flags=0x1ffff00000010200
      [71556.835352] INFO: Object 0x00000000563480a1 @offset=2688 fp=0x00000000289567b2
      
      [71556.835359] Redzone 000000006783cde2: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb  ................
      [71556.835363] Redzone 00000000e35b876e: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb  ................
      [71556.835367] Redzone 0000000023074562: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb  ................
      [71556.835372] Redzone 00000000b9564b8c: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb  ................
      [71556.835376] Redzone 00000000810c6362: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb  ................
      [71556.835380] Redzone 0000000065ef52c3: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb  ................
      [71556.835384] Redzone 00000000c5dd6984: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb  ................
      [71556.835388] Redzone 000000004c480f8f: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb  ................
      [71556.835392] Object 00000000563480a1: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
      [71556.835397] Object 000000009c479d06: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
      [71556.835401] Object 000000006e1dce92: 6b 6b 6b 6b 00 00 00 00 6b 6b 6b 6b 6b 6b 6b 6b  kkkk....kkkkkkkk
      [71556.835405] Object 00000000227f7cf8: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
      [71556.835410] Object 000000009a701215: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
      [71556.835414] Object 000000003731ce76: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
      [71556.835418] Object 00000000f7085967: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
      [71556.835422] Object 0000000007f99927: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b a5  kkkkkkkkkkkkkkk.
      [71556.835427] Redzone 00000000579c4913: bb bb bb bb bb bb bb bb                          ........
      [71556.835431] Padding 00000000305aef82: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a  ZZZZZZZZZZZZZZZZ
      [71556.835435] Padding 00000000b1cdd722: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a  ZZZZZZZZZZZZZZZZ
      [71556.835438] Padding 00000000c7568199: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a  ZZZZZZZZZZZZZZZZ
      [71556.835442] Padding 00000000fad4c4d4: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a  ZZZZZZZZZZZZZZZZ
      [71556.835451] CPU: 0 PID: 47939 Comm: kworker/0:15 Tainted: G    B      OE     5.9.0-rc1uschi+ #54
      [71556.835456] Hardware name: IBM 3906 M03 703 (LPAR)
      [71556.835464] Workqueue: events smc_listen_work [smc]
      [71556.835470] Call Trace:
      [71556.835478]  [<00000000d5eaeb10>] show_stack+0x90/0xf8
      [71556.835493]  [<00000000d66fc0f8>] dump_stack+0xa8/0xe8
      [71556.835499]  [<00000000d61a511c>] check_bytes_and_report+0x104/0x130
      [71556.835504]  [<00000000d61a57b2>] check_object+0x26a/0x2e0
      [71556.835509]  [<00000000d61a59bc>] alloc_debug_processing+0x194/0x238
      [71556.835514]  [<00000000d61a8c14>] ___slab_alloc+0x5a4/0x690
      [71556.835519]  [<00000000d61a9170>] __slab_alloc.constprop.0+0x70/0xb0
      [71556.835524]  [<00000000d61aaf66>] kmem_cache_alloc_trace+0x38e/0x3f8
      [71556.835530]  [<000003ff80549bbc>] __smc_buf_create+0x184/0x578 [smc]
      [71556.835538]  [<000003ff8054a396>] smc_buf_create+0x2e/0xe8 [smc]
      [71556.835545]  [<000003ff80540c16>] smc_listen_work+0x516/0x6a0 [smc]
      [71556.835549]  [<00000000d5f0f448>] process_one_work+0x280/0x478
      [71556.835554]  [<00000000d5f0f6a6>] worker_thread+0x66/0x368
      [71556.835559]  [<00000000d5f18692>] kthread+0x17a/0x1a0
      [71556.835563]  [<00000000d6abf3b8>] ret_from_fork+0x28/0x2c
      [71556.835569] INFO: lockdep is turned off.
      [71556.835573] FIX kmalloc-128: Restoring 0x00000000d20894be-0x00000000aaef63e9=0x6b
      
      [71556.835577] FIX kmalloc-128: Marking all objects used
      
      Fixes: fd7f3a74 ("net/smc: remove freed buffer from list")
      Reviewed-by: Karsten Graul <kgraul@linux.ibm.com>
      Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
      Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      1d8df41d
    • net/smc: set rx_off for SMCR explicitly · 2d2bfeb8
      Ursula Braun authored
      SMC tries to make use of SMCD first. If a problem shows up,
      it tries to switch to SMCR. If the SMCD initialization problem shows
      up after the SMCD connection has already been initialized, field
      rx_off keeps the wrong SMCD value for SMCR, which results in corrupted
      data at the receiver.
      This patch adds an explicit (re-)setting of field rx_off to zero if the
      connection uses SMCR.
      
      Fixes: be244f28 ("net/smc: add SMC-D support in data transfer")
      Reviewed-by: Karsten Graul <kgraul@linux.ibm.com>
      Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
      Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/smc: fix toleration of fake add_link messages · fffe83c8
      Karsten Graul authored
      Older SMCR implementations had no link failover support and used one
      link only. Because the handshake protocol requires an attempt to
      establish a second link, the old code sent a fake add_link message
      and declined any server response afterwards.
      The current code supports multiple links and inspects the received fake
      add_link message more closely. To tolerate the fake add_link messages,
      smc_llc_is_local_add_link() needs an improved check of the message to
      distinguish between locally enqueued and fake add_link messages.
      In addition, smc_llc_cli_add_link() needs to check whether the provided
      qp_mtu size is invalid and reject the add_link request in that case.
      
      Fixes: c48254fa ("net/smc: move add link processing for new device into llc layer")
      Reviewed-by: Ursula Braun <ubraun@linux.ibm.com>
      Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tg3: Fix soft lockup when tg3_reset_task() fails. · 55669934
      Michael Chan authored
      If tg3_reset_task() fails, the device is left in an inconsistent
      state with IFF_RUNNING still set but NAPI state not enabled.  A
      subsequent operation, such as ifdown or an AER error, can cause it to
      soft lock up when it tries to disable NAPI state.
      
      Fix it by bringing down the device to !IFF_RUNNING state when
      tg3_reset_task() fails.  tg3_reset_task() running from workqueue
      will now call tg3_close() when the reset fails.  We need to
      modify tg3_reset_task_cancel() slightly to avoid tg3_close()
      calling cancel_work_sync() to cancel tg3_reset_task().  Otherwise
      cancel_work_sync() will wait forever for tg3_reset_task() to
      finish.

      Reported-by: David Christensen <drc@linux.vnet.ibm.com>
      Reported-by: Baptiste Covolato <baptiste@arista.com>
      Fixes: db219973 ("tg3: Schedule at most one tg3_reset_task run")
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • perf tools: Add bpf image check to __map__is_kmodule · 830fadfd
      Jiri Olsa authored
      When validating kcore modules, the do_validate_kcore_modules function
      checks every kernel module dso against the modules record. The
      __map__is_kmodule check is used to let only kernel module dso objects
      through.

      Currently bpf images slip through the check and make the validation
      fail, so report falls back from kcore usage to kallsyms.

      Add a __map__is_bpf_image check for bpf images and use it in the
      __map__is_kmodule check.
      
      Fixes: 3c29d448 ("perf annotate: Add basic support for bpf_image")
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lore.kernel.org/lkml/20200826213017.818788-1-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf record/stat: Explicitly call out event modifiers in the documentation · e48a73a3
      Kim Phillips authored
      Event modifiers are not mentioned in the perf record or perf stat
      manpages.  Add them to orient new users more effectively by pointing
      them to the perf list manpage for details.
      
      Fixes: 2055fdaf ("perf list: Document precise event sampling for AMD IBS")
      Signed-off-by: Kim Phillips <kim.phillips@amd.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Clarke <pc@us.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tony Jones <tonyj@suse.de>
      Cc: stable@vger.kernel.org
      Link: http://lore.kernel.org/lkml/20200901215853.276234-1-kim.phillips@amd.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf bench: The do_run_multi_threaded() function must use IS_ERR(perf_session__new()) · e4d71f79
      YueHaibing authored
      In case of error, the function perf_session__new() returns ERR_PTR() and
      never returns NULL. The NULL test in the return value check should be
      replaced with IS_ERR().
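
      The distinction matters because an ERR_PTR() value is a non-NULL
      pointer, so a NULL test never fires on it. A minimal user-space sketch
      of the idea follows. The macros and helpers below are simplified
      stand-ins for the kernel-style ones (assumed, not copied from the
      tree), and session_new() is a hypothetical constructor used only for
      illustration, not perf_session__new() itself.

      ```c
      #include <errno.h>
      #include <stddef.h>
      #include <stdint.h>

      /* Simplified stand-ins for the kernel-style error-pointer helpers
       * (assumption: the real definitions differ in detail). */
      #define MAX_ERRNO 4095

      static inline void *ERR_PTR(long error)
      {
              return (void *)(intptr_t)error;
      }

      static inline long PTR_ERR(const void *ptr)
      {
              return (long)(intptr_t)ptr;
      }

      static inline int IS_ERR(const void *ptr)
      {
              /* Error pointers occupy the top MAX_ERRNO addresses. */
              return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
      }

      static int dummy_session;

      /* Hypothetical constructor that reports failure the ERR_PTR() way. */
      static void *session_new(int fail)
      {
              if (fail)
                      return ERR_PTR(-ENOMEM);
              return &dummy_session;
      }

      /* Returns 0 on success or a negative errno. A "!session" test here
       * would miss the failure because the error pointer is not NULL. */
      static int use_session(int fail)
      {
              void *session = session_new(fail);

              if (IS_ERR(session))
                      return (int)PTR_ERR(session);
              return 0;
      }
      ```

      Calling use_session() with a non-zero argument exercises the failure
      path and yields the negative errno carried inside the pointer.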
      
      Committer notes:
      
      This wasn't compiling due to an extraneous '{' not matched by a '}', fix
      it.
      
      Fixes: 13edc237 ("perf bench: Add a multi-threaded synthesize benchmark")
      Signed-off-by: YueHaibing <yuehaibing@huawei.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20200902140526.26916-1-yuehaibing@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf stat: Turn off summary for interval mode by default · ee6a9614
      Jin Yao authored
      There's a risk that outputting interval mode summaries by default breaks
      CSV consumers. It already broke pmu-tools/toplev.
      
      So now we turn off the summary by default but we create a new option
      '--summary' to enable the summary. This is active even when not using
      CSV mode.
      
      Before:
      
        root@kbl-ppc:~# perf stat -I1000 --interval-count 2
        #           time             counts unit events
             1.000265904           8,005.73 msec cpu-clock                 #    8.006 CPUs utilized
             1.000265904                601      context-switches          #    0.075 K/sec
             1.000265904                 10      cpu-migrations            #    0.001 K/sec
             1.000265904                  0      page-faults               #    0.000 K/sec
             1.000265904         66,746,521      cycles                    #    0.008 GHz
             1.000265904         71,874,398      instructions              #    1.08  insn per cycle
             1.000265904         13,356,781      branches                  #    1.668 M/sec
             1.000265904            298,756      branch-misses             #    2.24% of all branches
             2.001857667           8,012.52 msec cpu-clock                 #    8.013 CPUs utilized
             2.001857667                164      context-switches          #    0.020 K/sec
             2.001857667                 10      cpu-migrations            #    0.001 K/sec
             2.001857667                  2      page-faults               #    0.000 K/sec
             2.001857667          5,822,188      cycles                    #    0.001 GHz
             2.001857667          2,186,170      instructions              #    0.38  insn per cycle
             2.001857667            442,378      branches                  #    0.055 M/sec
             2.001857667             44,750      branch-misses             #   10.12% of all branches
      
         Performance counter stats for 'system wide':
      
                 16,018.25 msec cpu-clock                 #    7.993 CPUs utilized
                       765      context-switches          #    0.048 K/sec
                        20      cpu-migrations            #    0.001 K/sec
                         2      page-faults               #    0.000 K/sec
                72,568,709      cycles                    #    0.005 GHz
                74,060,568      instructions              #    1.02  insn per cycle
                13,799,159      branches                  #    0.861 M/sec
                   343,506      branch-misses             #    2.49% of all branches
      
               2.004118489 seconds time elapsed
      
      After:
      
        root@kbl-ppc:~# perf stat -I1000 --interval-count 2
        #           time             counts unit events
             1.001336393           8,013.28 msec cpu-clock                 #    8.013 CPUs utilized
             1.001336393                 82      context-switches          #    0.010 K/sec
             1.001336393                  8      cpu-migrations            #    0.001 K/sec
             1.001336393                  0      page-faults               #    0.000 K/sec
             1.001336393          4,199,121      cycles                    #    0.001 GHz
             1.001336393          1,373,991      instructions              #    0.33  insn per cycle
             1.001336393            270,681      branches                  #    0.034 M/sec
             1.001336393             31,659      branch-misses             #   11.70% of all branches
             2.003905006           8,020.52 msec cpu-clock                 #    8.021 CPUs utilized
             2.003905006                184      context-switches          #    0.023 K/sec
             2.003905006                  8      cpu-migrations            #    0.001 K/sec
             2.003905006                  2      page-faults               #    0.000 K/sec
             2.003905006          5,446,190      cycles                    #    0.001 GHz
             2.003905006          2,312,547      instructions              #    0.42  insn per cycle
             2.003905006            451,691      branches                  #    0.056 M/sec
             2.003905006             37,925      branch-misses             #    8.40% of all branches
      
        root@kbl-ppc:~# perf stat -I1000 --interval-count 2 --summary
        #           time             counts unit events
             1.001313128           8,013.20 msec cpu-clock                 #    8.013 CPUs utilized
             1.001313128                 83      context-switches          #    0.010 K/sec
             1.001313128                  8      cpu-migrations            #    0.001 K/sec
             1.001313128                  0      page-faults               #    0.000 K/sec
             1.001313128          4,470,950      cycles                    #    0.001 GHz
             1.001313128          1,440,045      instructions              #    0.32  insn per cycle
             1.001313128            283,222      branches                  #    0.035 M/sec
             1.001313128             33,576      branch-misses             #   11.86% of all branches
             2.003857385           8,020.34 msec cpu-clock                 #    8.020 CPUs utilized
             2.003857385                154      context-switches          #    0.019 K/sec
             2.003857385                  8      cpu-migrations            #    0.001 K/sec
             2.003857385                  2      page-faults               #    0.000 K/sec
             2.003857385          4,515,676      cycles                    #    0.001 GHz
             2.003857385          2,180,449      instructions              #    0.48  insn per cycle
             2.003857385            435,254      branches                  #    0.054 M/sec
             2.003857385             31,179      branch-misses             #    7.16% of all branches
      
         Performance counter stats for 'system wide':
      
                 16,033.53 msec cpu-clock                 #    7.992 CPUs utilized
                       237      context-switches          #    0.015 K/sec
                        16      cpu-migrations            #    0.001 K/sec
                         2      page-faults               #    0.000 K/sec
                 8,986,626      cycles                    #    0.001 GHz
                 3,620,494      instructions              #    0.40  insn per cycle
                   718,476      branches                  #    0.045 M/sec
                    64,755      branch-misses             #    9.01% of all branches
      
               2.006124542 seconds time elapsed
      
      Fixes: c7e5b328 ("perf stat: Report summary for interval mode")
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20200903010113.32232-1-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • libtraceevent: Fix build warning on 32-bit arches · 10a6f5c3
      Tzvetomir Stoyanov (VMware) authored
      Fixed a compilation warning for casting to pointer from integer of
      different size on 32-bit platforms.

      Reported-by: Arnaldo Carvalho de Melo <arnaldo.melo@gmail.com>
      Signed-off-by: Tzvetomir Stoyanov (VMware) <tz.stoyanov@gmail.com>
      Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Cc: linux-trace-devel@vger.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf jevents: Fix suspicious code in fixregex() · e62458e3
      Namhyung Kim authored
      The new string should have enough space for the original string and the
      backslashes IMHO.
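
      The sizing argument can be sketched as follows. This is an
      illustration only: escape_backslashes() is a hypothetical helper, not
      the fixregex() code from the patch, and it assumes the escaping step
      doubles every backslash. Under that assumption the output needs the
      original length, one extra byte per backslash, and the terminating NUL.

      ```c
      #include <stdlib.h>
      #include <string.h>

      /*
       * Hypothetical helper: return a newly allocated copy of s in which
       * every '\' is doubled. The caller frees the result. Returns NULL if
       * the allocation fails.
       */
      static char *escape_backslashes(const char *s)
      {
              size_t len = strlen(s);
              size_t esc_count = 0;
              const char *p;
              char *fixed, *q;

              /* Count the backslashes that will each need one extra byte. */
              for (p = s; *p; p++)
                      if (*p == '\\')
                              esc_count++;

              /* original length + one byte per backslash + NUL */
              fixed = malloc(len + esc_count + 1);
              if (!fixed)
                      return NULL;

              for (p = s, q = fixed; *p; p++) {
                      if (*p == '\\')
                              *q++ = '\\';
                      *q++ = *p;
              }
              *q = '\0';
              return fixed;
      }
      ```

      Allocating only len + 1 bytes here would overflow the buffer as soon
      as the input contains a single backslash.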
      
      Fixes: fbc2844e ("perf vendor events: Use more flexible pattern matching for CPU identification for mapfile.csv")
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Reviewed-by: Ian Rogers <irogers@google.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: William Cohen <wcohen@redhat.com>
      Link: http://lore.kernel.org/lkml/20200903152510.489233-1-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf parse-events: Use uintptr_t when casting numbers to pointers · 0823f768
      Arnaldo Carvalho de Melo authored
      To address these errors found when cross building from x86_64 to MIPS
      little endian 32-bit:
      
          CC       /tmp/build/perf/util/parse-events-bison.o
        util/parse-events.y: In function 'parse_events_parse':
        util/parse-events.y:514:6: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast]
          514 |      (void *) $2, $6, $4);
              |      ^
        util/parse-events.y:531:7: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast]
          531 |       (void *) $2, NULL, $4)) {
              |       ^
        util/parse-events.y:547:6: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast]
          547 |      (void *) $2, $4, 0);
              |      ^
        util/parse-events.y:564:7: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast]
          564 |       (void *) $2, NULL, 0)) {
              |       ^
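
      The error appears when the integer being cast is wider than a pointer
      on the target. A minimal sketch of the cast pattern named in the
      subject line follows. The helper names are hypothetical, and the
      64-bit width of the value is an assumption made for illustration.

      ```c
      #include <stddef.h>
      #include <stdint.h>

      /*
       * Hypothetical helper: convert a 64-bit integer to a pointer. Going
       * through uintptr_t first converts the value to pointer width
       * explicitly, so the final cast is between types of the same size.
       */
      static void *u64_to_ptr(uint64_t value)
      {
              return (void *)(uintptr_t)value;
      }

      /* Round-trip back to an integer, again via uintptr_t. */
      static uint64_t ptr_to_u64(const void *ptr)
      {
              return (uint64_t)(uintptr_t)ptr;
      }
      ```

      On a 32-bit target the intermediate cast discards the upper 32 bits,
      so the pattern is only safe when the value is known to fit in a
      pointer.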
      
      Fixes: cabbf268 ("perf parse: Before yyabort-ing free components")
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Martin KaFai Lau <kafai@fb.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Yonghong Song <yhs@fb.com>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • doc: net: dsa: Fix typo in config code sample · af0ae997
      Paul Barker authored
      In the "single port" example code for configuring a DSA switch without
      tagging support from userspace the command to bring up the "lan2" link
      was typo'd.

      Signed-off-by: Paul Barker <pbarker@konsulko.com>
      Acked-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge tag 'fixes-2020-09-03' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock · e28f0104
      Linus Torvalds authored
      Pull misc build failure fixes from Mike Rapoport:
       "Fix min_low_pfn/max_low_pfn build errors on ia64 and microblaze.
      
        Some configurations of ia64 and microblaze use min_low_pfn and
        max_low_pfn in pfn_valid(). This causes build failures for modules
        that use pfn_valid().
      
        The fix is to add EXPORT_SYMBOL() for these variables on ia64 and
        microblaze"
      
      * tag 'fixes-2020-09-03' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock:
        ia64: fix min_low_pfn/max_low_pfn build errors
        microblaze: fix min_low_pfn/max_low_pfn build errors
    • Merge tag 'affs-for-5.9-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux · 26acd8b0
      Linus Torvalds authored
      Pull affs fix from David Sterba:
       "One fix to make permissions work the same way as on AmigaOS"
      
      * tag 'affs-for-5.9-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
        affs: fix basic permission bits to actually work
    • Merge tag 'media/v5.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media · 0fdf68c7
      Linus Torvalds authored
      Pull media fixes from Mauro Carvalho Chehab:
      
       - a compilation fix for ti-vpe on 32-bit ARM
      
       - two Kconfig fixes for imx214 and max9286 drivers
      
       - a kernel information leak at v4l2-core on time32 compat ioctls
      
       - some fixes at rc core unbind logic
      
       - a fix at mceusb driver for it to not use GFP_ATOMIC
      
       - fixes at cedrus and vicodec drivers at the control handling logic
      
       - a fix at gpio-ir-tx to avoid disabling interrupts on a spinlock
      
      * tag 'media/v5.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
        media: mceusb: Avoid GFP_ATOMIC where it is not needed
        media: gpio-ir-tx: spinlock is not needed to disable interrupts
        media: rc: do not access device via sysfs after rc_unregister_device()
        media: rc: uevent sysfs file races with rc_unregister_device()
        media: max9286: Depend on OF_GPIO
        media: i2c: imx214: select V4L2_FWNODE
        media: cedrus: Add missing v4l2_ctrl_request_hdl_put()
        media: vicodec: add missing v4l2_ctrl_request_hdl_put()
        media: media/v4l2-core: Fix kernel-infoleak in video_put_user()
        media: ti-vpe: cal: Fix compilation on 32-bit ARM
  4. 02 Sep, 2020 11 commits
    • net: dp83867: Fix WoL SecureOn password · 8b4a11c6
      Dan Murphy authored
      Fix the registers being written to, as the values were being overwritten
      when writing the same registers.
      
      Fixes: caabee5b ("net: phy: dp83867: support Wake on LAN")
      Signed-off-by: Dan Murphy <dmurphy@ti.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • nfp: flower: fix ABI mismatch between driver and firmware · f614e536
      Louis Peens authored
      Fix an issue where the driver wrongly detected ipv6 neighbour updates
      from the NFP as corrupt. Add a reserved field on the kernel side so
      it is similar to the ipv4 version of the struct and has space for the
      extra bytes from the card.
      
      Fixes: 9ea9bfa1 ("nfp: flower: support ipv6 tunnel keep-alive messages from fw")
      Signed-off-by: Louis Peens <louis.peens@netronome.com>
      Signed-off-by: Simon Horman <simon.horman@netronome.com>
      Acked-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tipc: fix shutdown() of connectionless socket · 2a63866c
      Tetsuo Handa authored
      syzbot is reporting hung task at nbd_ioctl() [1], for there are two
      problems regarding TIPC's connectionless socket's shutdown() operation.
      
      ----------
      #include <fcntl.h>
      #include <sys/socket.h>
      #include <sys/ioctl.h>
      #include <linux/nbd.h>
      #include <unistd.h>
      
      int main(int argc, char *argv[])
      {
              const int fd = open("/dev/nbd0", 3);
              alarm(5);
              ioctl(fd, NBD_SET_SOCK, socket(PF_TIPC, SOCK_DGRAM, 0));
              ioctl(fd, NBD_DO_IT, 0); /* To be interrupted by SIGALRM. */
              return 0;
      }
      ----------
      
      One problem is that wait_for_completion() from flush_workqueue() from
      nbd_start_device_ioctl() from nbd_ioctl() cannot be completed when
      nbd_start_device_ioctl() received a signal at wait_event_interruptible(),
      for tipc_shutdown() from kernel_sock_shutdown(SHUT_RDWR) from
      nbd_mark_nsock_dead() from sock_shutdown() from nbd_start_device_ioctl()
      is failing to wake up a WQ thread sleeping at wait_woken() from
      tipc_wait_for_rcvmsg() from sock_recvmsg() from sock_xmit() from
      nbd_read_stat() from recv_work() scheduled by nbd_start_device() from
      nbd_start_device_ioctl(). Fix this problem by always invoking
      sk->sk_state_change() (like inet_shutdown() does) when tipc_shutdown() is
      called.
      
      The other problem is that tipc_wait_for_rcvmsg() cannot return when
      tipc_shutdown() is called, for tipc_shutdown() sets sk->sk_shutdown to
      SEND_SHUTDOWN (even though "how" is SHUT_RDWR) while tipc_wait_for_rcvmsg()
      needs sk->sk_shutdown set to RCV_SHUTDOWN or SHUTDOWN_MASK. Fix this
      problem by setting sk->sk_shutdown to SHUTDOWN_MASK (like inet_shutdown()
      does) when the socket is connectionless.
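
      The second problem can be sketched in user space as follows. The flag
      values mirror the kernel's RCV_SHUTDOWN, SEND_SHUTDOWN and
      SHUTDOWN_MASK definitions; receiver_must_wake() is a hypothetical
      stand-in for the condition the receive wait loop checks, not the TIPC
      code itself.

      ```c
      /* Shutdown flag values as used for sk->sk_shutdown. */
      #define RCV_SHUTDOWN    1
      #define SEND_SHUTDOWN   2
      #define SHUTDOWN_MASK   3

      /*
       * Hypothetical stand-in: a sleeping receiver only gives up waiting
       * once the receive side of the socket has been shut down.
       */
      static int receiver_must_wake(int sk_shutdown)
      {
              return (sk_shutdown & RCV_SHUTDOWN) != 0;
      }
      ```

      With sk_shutdown set to SEND_SHUTDOWN alone the receive bit is clear,
      so the waiter keeps sleeping; SHUTDOWN_MASK sets both bits and lets it
      return.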
      
      [1] https://syzkaller.appspot.com/bug?id=3fe51d307c1f0a845485cf1798aa059d12bf18b2

      Reported-by: syzbot <syzbot+e36f41d207137b5d12f7@syzkaller.appspotmail.com>
      Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: Fix sysctl max for fib_multipath_hash_policy · 05d44871
      Ido Schimmel authored
      Cited commit added the possible value of '2', but it cannot be set. Fix
      it by adjusting the maximum value to '2'. This is consistent with the
      corresponding IPv4 sysctl.
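
      The effect of the upper bound can be sketched with a small range
      check. set_bounded() is a hypothetical helper modelled on a min/max
      sysctl handler, and the old maximum of 1 is an assumption inferred
      from the description above, not taken from the patch.

      ```c
      #include <errno.h>

      /*
       * Hypothetical bounded setter: values outside [min, max] are rejected
       * with -EINVAL and the previously stored value is left untouched.
       */
      static int set_bounded(int *target, int value, int min, int max)
      {
              if (value < min || value > max)
                      return -EINVAL;
              *target = value;
              return 0;
      }
      ```

      With a maximum of 1, writing 2 fails with "Invalid argument" and the
      stored value stays at 0; raising the maximum to 2 lets the same write
      succeed.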
      
      Before:
      
      # sysctl -w net.ipv6.fib_multipath_hash_policy=2
      sysctl: setting key "net.ipv6.fib_multipath_hash_policy": Invalid argument
      net.ipv6.fib_multipath_hash_policy = 2
      # sysctl net.ipv6.fib_multipath_hash_policy
      net.ipv6.fib_multipath_hash_policy = 0
      
      After:
      
      # sysctl -w net.ipv6.fib_multipath_hash_policy=2
      net.ipv6.fib_multipath_hash_policy = 2
      # sysctl net.ipv6.fib_multipath_hash_policy
      net.ipv6.fib_multipath_hash_policy = 2
      
      Fixes: d8f74f09 ("ipv6: Support multipath hashing on inner IP pkts")
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Reviewed-by: Stephen Suryaputra <ssuryaextr@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • drivers/net/wan/hdlc: Change the default of hard_header_len to 0 · 2b7bcd96
      Xie He authored
      Change the default value of hard_header_len in hdlc.c from 16 to 0.
      
      Currently there are 6 HDLC protocol drivers, among them:
      
      hdlc_raw_eth, hdlc_cisco, hdlc_ppp, hdlc_x25 set hard_header_len when
      attaching the protocol, overriding the default. So this patch does not
      affect them.
      
      hdlc_raw and hdlc_fr don't set hard_header_len when attaching the
      protocol. So this patch will change the hard_header_len of the HDLC
      device for them from 16 to 0.
      
      This is the correct change because both hdlc_raw and hdlc_fr don't have
      header_ops, and the code in net/packet/af_packet.c expects the value of
      hard_header_len to be consistent with header_ops.
      
      In net/packet/af_packet.c, in the packet_snd function,
      for AF_PACKET/DGRAM sockets it would reserve a headroom of
      hard_header_len and call dev_hard_header to fill in that headroom,
      and for AF_PACKET/RAW sockets, it does not reserve the headroom and
      does not call dev_hard_header, but checks if the user has provided a
      header of length hard_header_len (in function dev_validate_header).
      
      Cc: Krzysztof Halasa <khc@pm.waw.pl>
      Cc: Martin Schiller <ms@dev.tdt.de>
      Signed-off-by: Xie He <xie.he.0141@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: gemini: Fix another missing clk_disable_unprepare() in probe · eb0f3bc4
      Dan Carpenter authored
      We recently added some calls to clk_disable_unprepare() but we missed
      the last error path if register_netdev() fails.
      
      I made a couple cleanups so we avoid mistakes like this in the future.
      First I reversed the "if (!ret)" condition and pulled the code in one
      indent level.  Also, the "port->netdev = NULL;" is not required because
      "port" isn't used again outside this function so I deleted that line.
      
      Fixes: 4d5ae32f ("net: ethernet: Add a driver for Gemini gigabit ethernet")
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid · fc3abb53
      Linus Torvalds authored
      Pull HID fixes from Jiri Kosina:
      
       - data sanitization and validation fixes for report descriptor parser
         from Marc Zyngier
      
       - memory leak fix for hid-elan driver from Dinghao Liu
      
       - two device-specific quirks
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
        HID: core: Sanitize event code and type when mapping input
        HID: core: Correctly handle ReportSize being zero
        HID: elan: Fix memleak in elan_input_configured
        HID: microsoft: Add rumble support for the 8bitdo SN30 Pro+ controller
        HID: quirks: Set INCREMENT_USAGE_ON_DUPLICATE for all Saitek X52 devices
    • net: bcmgenet: fix mask check in bcmgenet_validate_flow() · 1996cf46
      Denis Efremov authored
      VALIDATE_MASK(eth_mask->h_source) is checked twice in a row in
      bcmgenet_validate_flow(). Add VALIDATE_MASK(eth_mask->h_dest)
      instead.
      
      Fixes: 3e370952 ("net: bcmgenet: add support for ethtool rxnfc flows")
      Signed-off-by: Denis Efremov <efremov@linux.com>
      Acked-by: Doug Berger <opendmb@gmail.com>
      Acked-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      1996cf46
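
To illustrate why the duplicated check in the bcmgenet commit above matters, here is a minimal user-space sketch. The real VALIDATE_MASK() definition is not shown in the commit message, so `MASK_OK()` and `mask_is_valid()` are hypothetical stand-ins that assume a mask is acceptable only when it is all zeros or all ones; `struct eth_mask` is a simplified stand-in as well.

```c
#include <stdbool.h>
#include <stddef.h>

#define ETH_ALEN 6

/* Simplified stand-in for the Ethernet part of a flow mask. */
struct eth_mask {
	unsigned char h_dest[ETH_ALEN];
	unsigned char h_source[ETH_ALEN];
};

/* Assumption: a mask is acceptable only if all-zeros or all-ones. */
static bool mask_is_valid(const unsigned char *mask, size_t len)
{
	size_t i;

	for (i = 1; i < len; i++)
		if (mask[i] != mask[0])
			return false;
	return len == 0 || mask[0] == 0x00 || mask[0] == 0xff;
}

/* Hypothetical analogue of VALIDATE_MASK(); works on array members. */
#define MASK_OK(x) mask_is_valid((x), sizeof(x))

/* Buggy shape: h_source is tested twice, h_dest never. */
static bool validate_flow_buggy(const struct eth_mask *m)
{
	return MASK_OK(m->h_source) && MASK_OK(m->h_source);
}

/* Fixed shape: each field is tested once. */
static bool validate_flow_fixed(const struct eth_mask *m)
{
	return MASK_OK(m->h_source) && MASK_OK(m->h_dest);
}
```

A partial destination mask slips through the buggy variant and is rejected by the fixed one.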
    • Shyam Sundar S K's avatar
      amd-xgbe: Add support for new port mode · 7deedd9f
      Shyam Sundar S K authored
      Add support for a new port mode that is a backplane connection without
      support for auto negotiation.
      
      Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      7deedd9f
    • Linus Torvalds's avatar
      Merge tag 'for-5.9/dm-fixes' of... · c3a13095
      Linus Torvalds authored
      Merge tag 'for-5.9/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
      
      Pull device mapper fixes from Mike Snitzer:
      
       - writecache fix to allow dax_direct_access() to partitioned pmem
         devices.
      
       - multipath fix to avoid any Path Group initialization if
         'pg_init_in_progress' isn't set.
      
       - crypt fix to use DECLARE_CRYPTO_WAIT() for onstack wait structures.
      
       - integrity fix to properly check integrity after device creation when
         in bitmap mode.
      
       - thinp and cache target __create_persistent_data_objects() fixes to
         reset the metadata's dm_block_manager pointer from PTR_ERR to NULL
         before returning from error path.
      
       - persistent-data block manager fix to guard against dm_block_manager
         NULL pointer dereference in dm_bm_is_read_only() and update various
         opencoded bm->read_only checks to use dm_bm_is_read_only() instead.
      
      * tag 'for-5.9/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
        dm thin metadata: Fix use-after-free in dm_bm_set_read_only
        dm thin metadata:  Avoid returning cmd->bm wild pointer on error
        dm cache metadata: Avoid returning cmd->bm wild pointer on error
        dm integrity: fix error reporting in bitmap mode after creation
        dm crypt: Initialize crypto wait structures
        dm mpath: fix racey management of PG initialization
        dm writecache: handle DAX to partitions on persistent memory correctly
      c3a13095
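
The two pointer-handling fixes in the dm pull above (resetting an error pointer to NULL before returning, and guarding the read-only query against NULL) can be sketched in user-space C. Everything here is an assumption for illustration: the ERR_PTR-style helpers merely imitate the kernel's, and `block_manager`, `metadata`, `bm_create()`, `metadata_open()`, and `bm_is_read_only()` are hypothetical names, not the dm code. Treating a missing manager as read-only is also an assumption of this sketch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* User-space imitations of the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long err)
{
	return (void *)(intptr_t)err;
}

static inline bool IS_ERR(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

static inline long PTR_ERR(const void *p)
{
	return (long)(intptr_t)p;
}

struct block_manager {
	bool read_only;
};

struct metadata {
	struct block_manager *bm;
};

/* Hypothetical creation routine that can return an error pointer. */
static struct block_manager *bm_create(struct block_manager *backing, bool fail)
{
	return fail ? ERR_PTR(-12) : backing;	/* -12 mirrors -ENOMEM */
}

static int metadata_open(struct metadata *md, struct block_manager *backing,
			 bool fail)
{
	md->bm = bm_create(backing, fail);
	if (IS_ERR(md->bm)) {
		long err = PTR_ERR(md->bm);

		md->bm = NULL;	/* never leave an error pointer behind */
		return (int)err;
	}
	return 0;
}

/* NULL-safe query: a missing manager is treated as read-only. */
static bool bm_is_read_only(const struct block_manager *bm)
{
	return bm ? bm->read_only : true;
}
```

Routing every read-only test through one NULL-safe helper, instead of open-coding `bm->read_only`, keeps later callers from dereferencing a pointer that the error path has cleared.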
    • Linus Torvalds's avatar
      Merge tag 'xfs-5.9-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux · e1d0126c
      Linus Torvalds authored
      Pull xfs fixes from Darrick Wong:
       "Various small corruption fixes that have come in during the past
        month:
      
         - Avoid a log recovery failure for an insert range operation by
           rolling deferred ops incrementally instead of at the end.
      
         - Fix an off-by-one error when calculating log space reservations for
           anything involving an inode allocation or free.
      
         - Fix a broken shortform xattr verifier.
      
         - Ensure that the shortform xattr header padding is always
           initialized to zero"
      
      * tag 'xfs-5.9-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
        xfs: initialize the shortform attr header padding entry
        xfs: fix boundary test in xfs_attr_shortform_verify
        xfs: fix off-by-one in inode alloc block reservation calculation
        xfs: finish dfops on every insert range shift iteration
      e1d0126c