    s390/cpumf: simplify detection of guest samples · df26c2e8
    Martin Schwidefsky authored
    There are three different code levels in regard to the identification
    of guest samples. They differ in the way the LPP instruction is used.
    
    1) Old kernels without the LPP instruction. The guest program parameter
       is always zero.
    2) Newer kernels load the process pid into the program parameter with LPP.
       The guest program parameter is non-zero if the guest executes in a
       process != idle.
    3) The latest kernels load ((1UL << 31) | pid) with LPP to make the value
       non-zero even for the idle task. The guest program parameter is non-zero
       if the guest is running.
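
As a rough illustration of the three levels above, the guest program parameter (gpp) could be modeled as below. The helper names are hypothetical and the C functions only stand in for what the LPP instruction does in the real kernels; the single detail taken from the description is the ((1UL << 31) | pid) encoding.

```c
#include <assert.h>
#include <stdint.h>

/* Flag bit from the description: ((1UL << 31) | pid) */
#define GPP_FLAG_SKETCH (UINT64_C(1) << 31)

/* Level 1: no LPP instruction, the parameter is always zero */
static uint64_t gpp_no_lpp(uint32_t pid)
{
    (void)pid;
    return 0;
}

/* Level 2: LPP loads the bare pid, so the idle task (pid 0) yields zero */
static uint64_t gpp_plain_pid(uint32_t pid)
{
    return pid;
}

/* Level 3: LPP loads the pid with bit 31 set, non-zero even for pid 0 */
static uint64_t gpp_with_flag(uint32_t pid)
{
    return GPP_FLAG_SKETCH | pid;
}

/* Sanity check of the property each level is described to have */
static void check_levels(void)
{
    assert(gpp_no_lpp(1234) == 0);     /* level 1: always zero */
    assert(gpp_plain_pid(0) == 0);     /* level 2: zero for idle */
    assert(gpp_plain_pid(1234) != 0);  /* level 2: non-zero otherwise */
    assert(gpp_with_flag(0) != 0);     /* level 3: non-zero even for idle */
}
```

Only level 3 lets the host tell a running guest from the host by looking at gpp alone, which is why the older levels need the CR4 fallback.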
    
    All kernels load the process pid to CR4 on context switch. The CPU sampling
    code uses the value in CR4 to decide between guest and host samples in case
    the guest program parameter is zero. The three cases:
    
    1) CR4==pid, gpp==0
    2) CR4==pid, gpp==pid
    3) CR4==pid, gpp==((1UL << 31) | pid)
    
    The load-control instruction that loads the pid into CR4 is expensive and
    the goal is to remove it. To distinguish the host CR4 from the guest pid
    of the idle process, the maximum PASN value 0xffff is used.
    This adds a fourth case for a guest OS with an updated kernel:
    
    4) CR4==0xffff, gpp==((1UL << 31) | pid)
    
    The host kernel will have CR4==0xffff and will use (gpp!=0 || CR4!=0xffff)
    to identify guest samples. This works for all four cases. The only
    possible issue would be a guest with an old kernel (gpp==0) and a process
    pid of 0xffff. Well, don't do that.
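
The check can be sketched in C as follows. This is a minimal model, not the code in perf_cpum_sf.c: the struct, its field names and the helper names are hypothetical, the 16-bit width of the CR4 value is an assumption based on the PASN maximum of 0xffff, and a host sample is assumed to carry gpp==0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define HOST_PASN_SKETCH 0xffffu           /* maximum PASN, host CR4 value */
#define GPP_FLAG_SKETCH  (UINT64_C(1) << 31)

/* Hypothetical view of a sample: only the two fields discussed above */
struct sample_sketch {
    uint16_t cr4_pasn;  /* PASN part of CR4 (16 bits assumed) */
    uint64_t gpp;       /* guest program parameter */
};

/* Guest sample if gpp is non-zero or CR4 is not the host value */
static bool is_guest_sample(const struct sample_sketch *s)
{
    return s->gpp != 0 || s->cr4_pasn != HOST_PASN_SKETCH;
}

/* Walk through the four guest cases, the host, and the known collision */
static void check_all_cases(void)
{
    const uint16_t pid = 42;
    struct sample_sketch c1 = { pid, 0 };                          /* case 1 */
    struct sample_sketch c2 = { pid, pid };                        /* case 2 */
    struct sample_sketch c3 = { pid, GPP_FLAG_SKETCH | pid };      /* case 3 */
    struct sample_sketch c4 = { HOST_PASN_SKETCH,
                                GPP_FLAG_SKETCH | pid };           /* case 4 */
    struct sample_sketch host = { HOST_PASN_SKETCH, 0 };
    /* Old guest kernel (gpp==0) running a process with pid 0xffff */
    struct sample_sketch clash = { 0xffff, 0 };

    assert(is_guest_sample(&c1));
    assert(is_guest_sample(&c2));
    assert(is_guest_sample(&c3));
    assert(is_guest_sample(&c4));
    assert(!is_guest_sample(&host));
    assert(!is_guest_sample(&clash));  /* misread as a host sample */
}
```

The last assertion shows the corner case named above: such a sample is indistinguishable from a host sample under this check.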

    Suggested-by: Christian Borntraeger <borntraeger@de.ibm.com>
    Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
    Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
perf_cpum_sf.c 45.9 KB