• Thomas Richter's avatar
    perf lock contention: Fix core dump related to not finding the "__sched_text_end" symbol on s/390 · d8d85ce8
    Thomas Richter authored
    The test case perf lock contention dumps core on s390. Run the following
    commands:
    
      # ./perf lock record -- ./perf bench sched messaging
      # Running 'sched/messaging' benchmark:
      # 20 sender and receiver processes per group
      # 10 groups == 400 processes run
    
          Total time: 2.799 [sec]
      [ perf record: Woken up 1 times to write data ]
      [ perf record: Captured and wrote 0.073 MB perf.data (100 samples) ]
      #
      # ./perf lock contention
      Segmentation fault (core dumped)
      #
    
    The function call stack is lengthy, here are the top 5 functions:
    
      # gdb ./perf core.24048
      GNU gdb (GDB) Fedora Linux 12.1-6.fc37
      Core was generated by `./perf lock contention'.
      Program terminated with signal SIGSEGV, Segmentation fault.
      #0  0x00000000011dd25c in machine__is_lock_function (machine=0x3029e28, addr=1789230) at util/machine.c:3356
             3356 machine->sched.text_end = kmap->unmap_ip(kmap, sym->start);
    
     (gdb) where
      #0  0x00000000011dd25c in machine__is_lock_function (machine=0x3029e28, addr=1789230) at util/machine.c:3356
      #1  0x000000000109f244 in callchain_id (evsel=0x30313e0, sample=0x3ffea4f77d0) at builtin-lock.c:957
      #2  0x000000000109e094 in get_key_by_aggr_mode (key=0x3ffea4f7290, addr=27758136, evsel=0x30313e0, sample=0x3ffea4f77d0) at builtin-lock.c:586
      #3  0x000000000109f4d0 in report_lock_contention_begin_event (evsel=0x30313e0, sample=0x3ffea4f77d0) at builtin-lock.c:1004
      #4  0x00000000010a00ae in evsel__process_contention_begin (evsel=0x30313e0, sample=0x3ffea4f77d0) at builtin-lock.c:1254
      #5  0x00000000010a0e14 in process_sample_event (tool=0x3ffea4f8480, event=0x3ff85601ef8, sample=0x3ffea4f77d0, evsel=0x30313e0, machine=0x3029e28) at builtin-lock.c:1464
      .....
    
    The issue is in function machine__is_lock_function() in file
    ./util/machine.c lines 3355:
    
       /* should not fail from here */
       sym = machine__find_kernel_symbol_by_name(machine, "__sched_text_end", &kmap);
       machine->sched.text_end = kmap->unmap_ip(kmap, sym->start)
    
    On s390 the symbol __sched_text_end is *NOT* in the symbol list and the
    resulting pointer sym is set to NULL. The sym->start is then a NULL pointer
    access and generates the core dump.
    
    The reason why __sched_text_end is not in the symbol list on s390 is
    simple:
    
    When the symbol list is created at perf start up with function calls
    
      dso__load
      +--> dso__load_vmlinux_path
           +--> dso__load_vmlinux
                +--> dso__load_sym
    	         +--> dso__load_sym_internal (reads kernel symbols)
    		 +--> symbols__fixup_end
    		 +--> symbols__fixup_duplicate
    
    The issue is in function symbols__fixup_duplicate(). It deletes all
    symbols with have the same address. On s390:
    
      # nm -g  ~/linux/vmlinux| fgrep c68390
      0000000000c68390 T __cpuidle_text_start
      0000000000c68390 T __sched_text_end
      #
    
    two symbols have identical addresses and __sched_text_end is considered
    duplicate (in ascending sort order) and removed from the symbol list.
    Therefore it is missing and an invalid pointer reference occurs.  The
    code checks for symbol __sched_text_start and when it exists assumes
    symbol __sched_text_end is also in the symbol table. However this is not
    the case on s390.
    
    Same situation exists for symbol __lock_text_start:
    
    0000000000c68770 T __cpuidle_text_end
    0000000000c68770 T __lock_text_start
    
    This symbol is also removed from the symbol table but used in function
    machine__is_lock_function().
    
    To fix this and keep duplicate symbols in the symbol table, set
    symbol_conf.allow_aliases to true. This prevents the removal of
    duplicate symbols in function symbols__fixup_duplicate().
    
    Output After:
    
     # ./perf lock contention
     contended total wait  max wait  avg wait    type   caller
    
            48   124.39 ms 123.99 ms   2.59 ms rwsem:W unlink_anon_vmas+0x24a
            47    83.68 ms  83.26 ms   1.78 ms rwsem:W free_pgtables+0x132
             5    41.22 us  10.55 us   8.24 us rwsem:W free_pgtables+0x140
             4    40.12 us  20.55 us  10.03 us rwsem:W copy_process+0x1ac8
     #
    
    Fixes: 0d2997f7 ("perf lock: Look up callchain for the contended locks")
    Signed-off-by: default avatarThomas Richter <tmricht@linux.ibm.com>
    Acked-by: default avatarNamhyung Kim <namhyung@kernel.org>
    Cc: Heiko Carstens <hca@linux.ibm.com>
    Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
    Cc: Sven Schnelle <svens@linux.ibm.com>
    Cc: Vasily Gorbik <gor@linux.ibm.com>
    Link: https://lore.kernel.org/r/20221230102627.2410847-1-tmricht@linux.ibm.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
    d8d85ce8
builtin-lock.c 51.8 KB