    perf sched replay: Increase the MAX_PID value to fix assertion failure problem · a35e27d0
    Yunlong Song authored
    The current MAX_PID is only 65536, which causes an assertion failure
    on x86_64 systems with more than 64 CPU cores.
    
    This is because the pid_max value in x86_64 is at least
    PIDS_PER_CPU_DEFAULT * num_possible_cpus() (see function pidmap_init
    defined in kernel/pid.c), where PIDS_PER_CPU_DEFAULT is 1024 (defined in
    include/linux/threads.h).
    
    Thus MAX_PID = 65536 corresponds to only 65536/1024 = 64 CPU cores.
    This is not enough for larger x86_64 systems, and leads to an assertion
    failure through BUG_ON(pid >= MAX_PID) in the code.
    
    Increase MAX_PID from 65536 to 1024*1000, which covers x86_64 systems
    with up to 1000 cores.
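
    The arithmetic above can be sketched in a few lines of C. This is only
    an illustration of the reasoning, not code from the patch: the helper
    names (pid_max_lower_bound, cores_covered, max_pid_is_enough, check_pid)
    are hypothetical, and only the constant 1024 and the BUG_ON condition
    come from the text.

```c
#include <assert.h>

/* Value quoted above from include/linux/threads.h. */
#define PIDS_PER_CPU_DEFAULT 1024L

/* Lower bound on pid_max for a machine with ncpus possible CPUs. */
long pid_max_lower_bound(long ncpus)
{
	return PIDS_PER_CPU_DEFAULT * ncpus;
}

/* Number of cores whose pid_max lower bound still fits in max_pid slots. */
long cores_covered(long max_pid)
{
	return max_pid / PIDS_PER_CPU_DEFAULT;
}

/*
 * Hypothetical helper: nonzero if every PID below the pid_max lower bound
 * satisfies !(pid >= max_pid), i.e. would not trip the BUG_ON.
 */
int max_pid_is_enough(long max_pid, long ncpus)
{
	return pid_max_lower_bound(ncpus) <= max_pid;
}

/* Mirrors the condition from the failure message: BUG_ON(pid >= MAX_PID). */
void check_pid(long pid, long max_pid)
{
	assert(!(pid >= max_pid));
}
```

    For a 160-core machine, max_pid_is_enough(65536, 160) is false while
    max_pid_is_enough(1024 * 1000, 160) is true.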
    
    This number was chosen according to the stack size limit of the
    calling process.
    
    'ulimit -a' shows that the default stack size limit of a process is
    8192 Kbytes, which is defined in include/uapi/linux/resource.h (#define
    _STK_LIM (8*1024*1024)).
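
    The same limit can also be read from inside a program with the POSIX
    getrlimit() call. A minimal sketch, where stack_soft_limit is a
    hypothetical helper name and not part of perf:

```c
#include <sys/resource.h>

/*
 * Store the soft stack limit (in bytes) in *out.
 * Returns 0 on success, -1 if getrlimit() fails.
 */
int stack_soft_limit(rlim_t *out)
{
	struct rlimit rl;

	if (getrlimit(RLIMIT_STACK, &rl) != 0)
		return -1;
	*out = rl.rlim_cur;	/* may be RLIM_INFINITY if the limit was lifted */
	return 0;
}
```

    The value depends on the environment; it matches 8192 Kbytes only
    where the default limit has not been changed.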
    
    Thus we choose a value for MAX_PID that is large enough but still
    respects the stack size limit, i.e., the perf process takes up just
    under 8192 Kbytes of memory.
    
    We have calculated and tested that 1024*1000 is OK for MAX_PID.
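
    A rough version of that calculation, as a sketch only. It assumes the
    dominant stack consumer is a table with one 8-byte pointer per PID
    slot; that slot size is an assumption made here for illustration, and
    the helper names are hypothetical.

```c
/* _STK_LIM as quoted above from include/uapi/linux/resource.h. */
#define STK_LIM (8L * 1024 * 1024)

/* Assumption: one 8-byte pointer per PID slot (x86_64 pointer size). */
#define SLOT_BYTES 8L

/* Bytes needed for a table with max_pid slots. */
long pid_table_bytes(long max_pid)
{
	return max_pid * SLOT_BYTES;
}

/* Nonzero if the table alone stays strictly below the stack limit. */
int fits_below_stack_limit(long max_pid)
{
	return pid_table_bytes(max_pid) < STK_LIM;
}
```

    Under this assumption, 1024*1000 slots need 8192000 bytes, which
    leaves 196608 bytes below the 8388608-byte limit for everything else
    on the stack, whereas 1024*1024 slots would already consume the whole
    limit.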
    
    This means perf sched replay can now be used on x86_64 systems with up
    to 1000 cores without hitting the assertion failure.
    
    Example:
    
    Test environment: x86_64 with 160 cores
    
     $ cat /proc/sys/kernel/pid_max
     163840
    
    Before this patch:
    
     $ perf sched replay
     run measurement overhead: 240 nsecs
     sleep measurement overhead: 55379 nsecs
     the run test took 1000004 nsecs
     the sleep test took 1059424 nsecs
     perf: builtin-sched.c:330: register_pid: Assertion `!(pid >= 65536)'
     failed.
     Aborted
    
    After this patch:
    
     $ perf sched replay
     run measurement overhead: 221 nsecs
     sleep measurement overhead: 55397 nsecs
     the run test took 999920 nsecs
     the sleep test took 1053313 nsecs
     nr_run_events:        10
     nr_sleep_events:      1562
     nr_wakeup_events:     5
     task      0 (                  :1:         1), nr_events: 1
     task      1 (                  :2:         2), nr_events: 1
     task      2 (                  :3:         3), nr_events: 1
     task      3 (                  :5:         5), nr_events: 1
     ...

    Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
    Cc: Paul Mackerras <paulus@samba.org>
    Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
    Cc: Wang Nan <wangnan0@huawei.com>
    Link: http://lkml.kernel.org/r/1427809596-29559-3-git-send-email-yunlong.song@huawei.com
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
builtin-sched.c 43.8 KB