    crash: move a few code bits to setup support of crash hotplug · 6f991cc3
    Eric DeVolder authored
    Patch series "crash: Kernel handling of CPU and memory hot un/plug", v28.
    
    Once the kdump service is loaded, if changes to CPUs or memory occur,
    either by hot un/plug or off/onlining, the crash elfcorehdr must also be
    updated.
    
    The elfcorehdr describes to kdump the CPUs and memory in the system, and
    any inaccuracies can result in a vmcore with missing CPU context or memory
    regions.
    
    The current solution utilizes udev to initiate an unload-then-reload of
    the kdump image (e.g. kernel, initrd, boot_params, purgatory and
    elfcorehdr) by the userspace kexec utility.  In the original post I
    outlined the significant performance problems related to offloading this
    activity to userspace.
    
    This patchset introduces a generic crash handler that registers with the
    CPU and memory notifiers.  Upon CPU or memory changes, from either hot
    un/plug or off/onlining, this generic handler is invoked and performs
    important housekeeping, for example obtaining the appropriate lock, and
    then invokes an architecture specific handler to do the appropriate
    elfcorehdr update.
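
    The control flow described above can be pictured with a small userspace
    toy model.  This is only an illustrative sketch, not the kernel code: the
    class name, the notify() method, the image_loaded flag and the callback
    signature are all hypothetical, and a threading.Lock merely stands in for
    whatever lock the generic handler takes.

```python
import threading


class CrashHotplugHandler:
    """Toy model: the generic handler does housekeeping, then calls an arch hook."""

    def __init__(self, arch_update_elfcorehdr):
        self._lock = threading.Lock()          # stand-in for the kernel-side lock
        self._arch_update = arch_update_elfcorehdr
        self.image_loaded = False              # assumption: skip work if no kdump image

    def notify(self, event, resource):
        """Called for each CPU/memory hot un/plug or off/online event."""
        with self._lock:                       # housekeeping: take the lock first
            if not self.image_loaded:
                return False                   # nothing to update yet
            self._arch_update(event, resource) # arch-specific elfcorehdr update
            return True


updates = []
handler = CrashHotplugHandler(lambda ev, res: updates.append((ev, res)))
handler.notify("online", "cpu3")        # ignored: no kdump image loaded yet
handler.image_loaded = True
handler.notify("offline", "memory42")   # forwarded to the arch hook
print(updates)                          # [('offline', 'memory42')]
```

    The point of the split is that locking and event filtering live in one
    place, while each architecture supplies only the elfcorehdr update.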
    
    See the descriptions in patches 'crash: change
    crash_prepare_elf64_headers() to for_each_possible_cpu()' and 'x86/crash:
    optimize CPU changes', which enable further optimizations of elfcorehdr
    update performance for CPU plug/unplug/online/offline events.
    
    In the case of x86_64, the arch specific handler generates a new
    elfcorehdr and overwrites the old one in memory, so no userspace
    involvement is needed.
    
    To benefit from, or to test, this patchset, one must make a couple
    of minor changes to userspace:
    
     - Prevent udev from updating kdump crash kernel on hot un/plug changes.
       Add the following as the first lines to the RHEL udev rule file
       /usr/lib/udev/rules.d/98-kexec.rules:
    
       # The kernel updates the crash elfcorehdr for CPU and memory changes
       SUBSYSTEM=="cpu", ATTRS{crash_hotplug}=="1", GOTO="kdump_reload_end"
       SUBSYSTEM=="memory", ATTRS{crash_hotplug}=="1", GOTO="kdump_reload_end"
    
       With this changeset applied, the two rules match CPU and memory
       change events and jump to kdump_reload_end, thus skipping the
       userspace unload-then-reload of kdump.
    
     - Switch to kexec_file_load() for loading the kdump kernel.
       E.g. on RHEL, in /usr/bin/kdumpctl, change to:
        standard_kexec_args="-p -d -s"
       where the added -s selects the kexec_file_load() syscall.
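
    The udev rules above key on a crash_hotplug attribute of the cpu and
    memory subsystems.  A quick way to check what those rules would see is to
    read the attribute directly.  The sketch below assumes the attribute is
    exposed as /sys/devices/system/<subsystem>/crash_hotplug; that path and
    the helper name crash_hotplug_enabled() are assumptions for illustration
    and are not stated in this message.

```python
from pathlib import Path


def crash_hotplug_enabled(subsystem, sysfs_root="/sys/devices/system"):
    """Return True if <sysfs_root>/<subsystem>/crash_hotplug reads "1".

    The sysfs location is an assumption; adjust sysfs_root if it differs.
    """
    attr = Path(sysfs_root) / subsystem / "crash_hotplug"
    try:
        return attr.read_text().strip() == "1"
    except OSError:
        # Attribute missing or unreadable: treat as "kernel does not handle it".
        return False


if __name__ == "__main__":
    for subsystem in ("cpu", "memory"):
        if crash_hotplug_enabled(subsystem):
            print(f"{subsystem}: kernel updates the elfcorehdr")
        else:
            print(f"{subsystem}: userspace kdump reload still required")
```

    If either subsystem reports that a userspace reload is still required,
    the corresponding udev rule will not match and kdump falls back to the
    unload-then-reload path.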
    
    This kernel patchset also supports kexec_load() with a modified kexec
    userspace utility.  A working changeset to the kexec userspace utility is
    posted to the kexec-tools mailing list here:
    
     http://lists.infradead.org/pipermail/kexec/2023-May/027049.html
    
    To use the kexec-tools patch, apply, build and install kexec-tools, then
    change the kdumpctl's standard_kexec_args to replace the -s with
    --hotplug.  The removal of -s reverts to the kexec_load syscall and the
    addition of --hotplug invokes the changes put forth in the kexec-tools
    patch.
    
    
    This patch (of 8):
    
    The crash hotplug support leans on the work for the kexec_file_load()
    syscall.  To also support the kexec_load() syscall, a few bits of code
    need to be moved outside of CONFIG_KEXEC_FILE.  As such, these bits are
    moved out of kexec_file.c and into a common location crash_core.c.
    
    In addition, struct crash_mem and crash_notes were moved to new locations so
    that PROC_KCORE, which sets CRASH_CORE alone, builds correctly.
    
    No functionality change intended.
    
    Link: https://lkml.kernel.org/r/20230814214446.6659-1-eric.devolder@oracle.com
    Link: https://lkml.kernel.org/r/20230814214446.6659-2-eric.devolder@oracle.com
    Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
    Reviewed-by: Sourabh Jain <sourabhjain@linux.ibm.com>
    Acked-by: Hari Bathini <hbathini@linux.ibm.com>
    Acked-by: Baoquan He <bhe@redhat.com>
    Cc: Akhil Raj <lf32.dev@gmail.com>
    Cc: Bjorn Helgaas <bhelgaas@google.com>
    Cc: Borislav Petkov (AMD) <bp@alien8.de>
    Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
    Cc: Dave Hansen <dave.hansen@linux.intel.com>
    Cc: Dave Young <dyoung@redhat.com>
    Cc: David Hildenbrand <david@redhat.com>
    Cc: Eric W. Biederman <ebiederm@xmission.com>
    Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Cc: "H. Peter Anvin" <hpa@zytor.com>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Jonathan Corbet <corbet@lwn.net>
    Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
    Cc: Mimi Zohar <zohar@linux.ibm.com>
    Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
    Cc: Oscar Salvador <osalvador@suse.de>
    Cc: "Rafael J. Wysocki" <rafael@kernel.org>
    Cc: Sean Christopherson <seanjc@google.com>
    Cc: Takashi Iwai <tiwai@suse.de>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: Thomas Weißschuh <linux@weissschuh.net>
    Cc: Valentin Schneider <vschneid@redhat.com>
    Cc: Vivek Goyal <vgoyal@redhat.com>
    Cc: Vlastimil Babka <vbabka@suse.cz>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
crash_core.h 3.96 KB