    x86, ACPI: Handle apic/x2apic entries in MADT in correct order · d81056b5
    Lukasz Anaczkowski authored
    ACPI specifies the following rules when listing APIC IDs:
    (1) Boot processor is listed first
    (2) For multi-threaded processors, BIOS should list the first logical
        processor of each of the individual multi-threaded processors in MADT
        before listing any of the second logical processors.
    (3) APIC IDs < 0xFF should be listed in APIC subtable, APIC IDs >= 0xFF
        should be listed in X2APIC subtable
    
    Because of the above, when there are more than 0xFF logical CPUs, the
    BIOS interleaves APIC and X2APIC subtables.
    
    Assuming there are 72 cores with 4 hyper-threads each, 288 CPUs in
    total, the listing looks like this:
    
    APIC (0,4,8,...,252)
    X2APIC (256,260,264,...,284)
    APIC (1,5,9,...,253)
    X2APIC (257,261,265,...,285)
    APIC (2,6,10,...,254)
    X2APIC (258,262,266,...,286)
    APIC (3,7,11,...,251)
    X2APIC (255,259,263,...,287)
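
The interleaving follows mechanically from rules (1) to (3). The sketch below is a userspace model, not kernel or firmware code. It assumes APIC IDs are assigned as `core * THREADS + thread` and that the boot processor has APIC ID 0; `struct madt_entry`, `enum subtable` and `build_madt()` are hypothetical names introduced only for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define CORES   72
#define THREADS 4
#define NR_CPUS (CORES * THREADS)

/* Hypothetical model of one MADT processor entry (not the ACPI layout). */
enum subtable { LOCAL_APIC, LOCAL_X2APIC };

struct madt_entry {
    enum subtable type;
    unsigned int apic_id;
};

/*
 * Fill 'out' in the order a BIOS following rules (1)-(3) would use and
 * return the number of entries written.
 * Assumption: APIC ID = core * THREADS + thread, boot CPU has APIC ID 0.
 */
static size_t build_madt(struct madt_entry *out, size_t cap)
{
    size_t n = 0;

    /* Rule (2): all first threads, then all second threads, and so on. */
    for (unsigned int t = 0; t < THREADS; t++) {
        /* Rule (3): within one thread group, IDs < 0xFF go into APIC
         * entries (pass 0), IDs >= 0xFF into X2APIC entries (pass 1). */
        for (int pass = 0; pass < 2; pass++) {
            for (unsigned int c = 0; c < CORES; c++) {
                unsigned int id = c * THREADS + t;
                int is_x2 = id >= 0xFF;

                if (is_x2 != pass)
                    continue;
                assert(n < cap);
                out[n].type = is_x2 ? LOCAL_X2APIC : LOCAL_APIC;
                out[n].apic_id = id;
                n++;
            }
        }
    }
    return n;
}
```

Calling `build_madt()` with a buffer of `NR_CPUS` entries yields eight alternating APIC/X2APIC groups, one pair per hardware thread, which is the shape of the listing above.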
    
    Before this patch, because of how the ACPI MADT subtables were parsed
    (BSP, then X2APIC, then APIC), the kernel enumerated CPUs in reversed
    order (i.e. high APIC IDs were getting low logical IDs, and low APIC
    IDs were getting high logical IDs).
    This is wrong for the following reasons:
    (1) it's hard to predict how cores and threads are enumerated
    (2) when it's hard to predict, s/w threads cannot be properly
        affinitized, causing significant performance impact due to e.g.
        improper cache sharing
    (3) enumeration is inconsistent with how threads are enumerated on
        other Intel Xeon processors
    
    So, the order in which the MADT APIC/X2APIC handlers are passed is
    reversed, and both handlers are passed together so that they are called
    during the same walk of the MADT table, which yields the correct CPU
    enumeration.
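
The difference between the two parsing schemes can be modelled in userspace. This is a sketch under stated assumptions, not the kernel implementation: `struct madt_entry`, `enumerate_per_type()` and `enumerate_single_walk()` are hypothetical names, and the only behaviour modelled is the order in which logical CPU numbers are handed out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a MADT processor entry (not the kernel's structs). */
enum subtable { LOCAL_APIC, LOCAL_X2APIC };

struct madt_entry {
    enum subtable type;
    unsigned int apic_id;
};

/*
 * Old scheme: the boot CPU becomes logical CPU 0, then the table is walked
 * once per subtable type, X2APIC entries first and APIC entries second.
 * Stores the APIC ID of each logical CPU and returns the CPU count.
 */
static size_t enumerate_per_type(const struct madt_entry *tbl, size_t n,
                                 unsigned int bsp_apic_id,
                                 unsigned int *cpu_to_apicid, size_t cap)
{
    static const enum subtable order[] = { LOCAL_X2APIC, LOCAL_APIC };
    size_t cpu = 0;

    assert(cap > 0);
    cpu_to_apicid[cpu++] = bsp_apic_id;

    for (size_t k = 0; k < sizeof(order) / sizeof(order[0]); k++) {
        for (size_t i = 0; i < n; i++) {
            if (tbl[i].type != order[k] || tbl[i].apic_id == bsp_apic_id)
                continue;
            assert(cpu < cap);
            cpu_to_apicid[cpu++] = tbl[i].apic_id;
        }
    }
    return cpu;
}

/*
 * New scheme: a single walk. Each entry is handled when it is encountered,
 * whichever subtable type it has, so logical IDs follow the table order.
 */
static size_t enumerate_single_walk(const struct madt_entry *tbl, size_t n,
                                    unsigned int bsp_apic_id,
                                    unsigned int *cpu_to_apicid, size_t cap)
{
    size_t cpu = 0;

    assert(cap > 0);
    cpu_to_apicid[cpu++] = bsp_apic_id;

    for (size_t i = 0; i < n; i++) {
        if (tbl[i].apic_id == bsp_apic_id)
            continue;
        assert(cpu < cap);
        cpu_to_apicid[cpu++] = tbl[i].apic_id;
    }
    return cpu;
}
```

For a small interleaved table such as APIC 0, APIC 4, X2APIC 256, APIC 1, APIC 5, X2APIC 257, the per-type walk places the two X2APIC IDs directly after the boot CPU, while the single walk keeps the first threads of each core ahead of the second threads.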
    
    If someone boots the kernel with the options 'maxcpus=72 nox2apic',
    fewer cores may be booted as a result, since some of the CPUs the
    kernel will try to use will have APIC ID >= 0xFF. In such a case, one
    should not pass 'nox2apic'.
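
A rough way to reason about this case is to count how many of the first `maxcpus` logical CPUs remain usable when X2APIC is disabled. The helper below is a simplified model, not kernel code: `count_bootable()` and `XAPIC_ID_LIMIT` are hypothetical names, and the only rule it encodes is the one stated above, that APIC IDs >= 0xFF require X2APIC.

```c
#include <assert.h>
#include <stddef.h>

/* Per the text above: APIC IDs >= 0xFF are only reachable with X2APIC. */
#define XAPIC_ID_LIMIT 0xFFu

/*
 * cpu_to_apicid[i] is the APIC ID of logical CPU i, in enumeration order.
 * Returns how many of the first 'maxcpus' logical CPUs can be brought up.
 */
static size_t count_bootable(const unsigned int *cpu_to_apicid, size_t n,
                             size_t maxcpus, int x2apic_enabled)
{
    size_t limit = maxcpus < n ? maxcpus : n;
    size_t booted = 0;

    assert(cpu_to_apicid != NULL || n == 0);

    for (size_t cpu = 0; cpu < limit; cpu++) {
        /* Without X2APIC, a CPU with a high APIC ID cannot be used. */
        if (!x2apic_enabled && cpu_to_apicid[cpu] >= XAPIC_ID_LIMIT)
            continue;
        booted++;
    }
    return booted;
}
```

In the 288-CPU example, the first 72 entries in table order would be the 64 APIC IDs 0 to 252 followed by the 8 X2APIC IDs 256 to 284, so under this model 'maxcpus=72 nox2apic' would leave 64 usable CPUs.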
    
    Disclaimer: the code parsing MADT APIC/X2APIC entries has not been
    touched since 2009, when X2APIC support was initially added. I do not
    know why the MADT parsing code was added in the reversed order in the
    first place. I guess it didn't matter at that time since nobody cared
    about cores with APIC IDs >= 0xFF, right?
    
    This patch is based on the work of "Yinghai Lu <yinghai@kernel.org>"
    previously published at https://lkml.org/lkml/2013/1/21/563
    
    Here's the explanation of why the parsing interface needs to be changed
    and why a simpler approach will not work:
    https://lkml.org/lkml/2015/9/7/285

    Signed-off-by: Lukasz Anaczkowski <lukasz.anaczkowski@intel.com>
    Acked-by: Thomas Gleixner <tglx@linutronix.de> (commit message)
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>