    powerpc/64: Move paca allocation later in boot · 2e7f1e2b
    Michael Ellerman authored
    Mahesh & Sourabh identified two problems[1][2] with ppc64_bolted_size()
    and paca allocation.
    
    The first is that on a Radix capable machine but with "disable_radix" on
    the command line, there is a window during early boot where
    early_radix_enabled() is true, even though it will later become false.
    
      early_init_devtree:                       <- early_radix_enabled() = false
        early_init_dt_scan_cpus:                <- early_radix_enabled() = false
            ...
            check_cpu_pa_features:              <- early_radix_enabled() = false
            ...                               ^ <- early_radix_enabled() = TRUE
            allocate_paca:                    | <- early_radix_enabled() = TRUE
                ...                           |
                ppc64_bolted_size:            | <- early_radix_enabled() = TRUE
                    if (early_radix_enabled())| <- early_radix_enabled() = TRUE
                        return ULONG_MAX;     |
            ...                               |
        ...                                   | <- early_radix_enabled() = TRUE
        ...                                   | <- early_radix_enabled() = TRUE
        mmu_early_init_devtree()              V
        ...                                     <- early_radix_enabled() = false
    
    This causes ppc64_bolted_size() to return ULONG_MAX for the boot CPU's
    paca allocation, even though later it will return a different value.
    This is not currently a bug because the paca allocation is also limited
    by the RMA size, but that is very fragile.
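
    The window in the trace above can be illustrated with a small
    stand-alone model. This is only a sketch, not kernel code: the
    boot_phase enum, the model_* functions and the hash_limit parameter
    are all hypothetical names invented for illustration.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical boot phases, named after the call trace above. */
enum boot_phase {
    BEFORE_PA_FEATURES, /* CPU features not scanned yet */
    AFTER_PA_FEATURES,  /* radix feature bit seen in the device tree */
    AFTER_MMU_INIT      /* "disable_radix" has been taken into account */
};

/* Toy stand-in for early_radix_enabled() on a Radix capable machine. */
bool model_radix_enabled(enum boot_phase phase, bool disable_radix)
{
    switch (phase) {
    case BEFORE_PA_FEATURES:
        return false;
    case AFTER_PA_FEATURES:
        return true;            /* command line not honoured yet */
    case AFTER_MMU_INIT:
        return !disable_radix;
    }
    return false;
}

/*
 * Toy stand-in for ppc64_bolted_size(). hash_limit is an assumed
 * placeholder for whatever the Hash MMU path would return.
 */
unsigned long model_bolted_size(enum boot_phase phase, bool disable_radix,
                                unsigned long hash_limit)
{
    if (model_radix_enabled(phase, disable_radix))
        return ULONG_MAX;
    return hash_limit;
}
```

    With disable_radix set, the model returns ULONG_MAX only when queried
    in the AFTER_PA_FEATURES phase, which is where the paca allocation
    currently happens, and returns hash_limit once AFTER_MMU_INIT is
    reached.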
    
    The second issue is that with the Hash MMU, when we call
    ppc64_bolted_size() for the boot CPU's paca allocation we have not yet
    detected whether 1T segments are available. That causes
    ppc64_bolted_size() to return 256MB, even if the machine can actually
    support up to 1T. This is usually OK because we generally have space
    below 256MB for one paca, but for a kdump kernel placed above 256MB it
    causes the boot to fail.
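
    The arithmetic behind the kdump failure can be sketched as follows.
    This is a simplified model under stated assumptions, not the kernel's
    allocator: the MODEL_* constants, both functions, and the idea of a
    single "lowest free address" are hypothetical simplifications.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_SEG_256M (UINT64_C(1) << 28) /* limit before 1T detection */
#define MODEL_SEG_1T   (UINT64_C(1) << 40) /* limit once 1T is detected */

/* Toy stand-in for the Hash MMU branch of ppc64_bolted_size(). */
uint64_t model_hash_bolted_size(bool seg_1t_detected)
{
    return seg_1t_detected ? MODEL_SEG_1T : MODEL_SEG_256M;
}

/*
 * Returns true if a region of `size` bytes starting at `lowest_free`
 * ends at or below `limit`. Written to avoid overflow in the addition.
 */
bool model_alloc_fits(uint64_t lowest_free, uint64_t size, uint64_t limit)
{
    return lowest_free <= limit && size <= limit - lowest_free;
}
```

    For a kernel whose lowest free memory is at 512MB, the model finds no
    room under the 256MB limit but succeeds once the 1T limit is known,
    which is why the ordering of detection and allocation matters.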
    
    At boot we cannot discover all the features of the machine
    instantaneously, so there will always be some periods where we have
    incomplete knowledge of the system. However both the above problems stem
    from the fact that we allocate the boot CPU's paca (and paca pointers
    array) before we decide which MMU we are using, or discover its exact
    features.
    
    Moving the paca allocation slightly later solves both issues described
    above, and means that for a normal boot we don't do any permanent
    allocations until after we've discovered the MMU.
    
    Note that although we move the boot CPU's paca allocation later, we
    still have a temporary paca (boot_paca) accessible via r13, so code that
    only reads paca fields is safe. The only risk is that some code writes
    to the boot_paca, and that write is then lost when we switch away from
    the boot_paca later in early_setup().
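
    The lost-write risk can be shown with a toy model. Everything here is
    hypothetical: struct model_paca, its some_field member and the model_*
    helpers are invented for illustration, and the sketch assumes the
    allocated paca starts from fresh values rather than being copied from
    the temporary one.

```c
#include <stdlib.h>

struct model_paca {
    int some_field; /* hypothetical field */
};

/* Temporary paca used early in boot (stand-in for boot_paca). */
static struct model_paca model_boot_paca;

/* Stand-in for the paca pointer the kernel keeps in r13. */
static struct model_paca *model_local_paca = &model_boot_paca;

void model_write_field(int value)
{
    model_local_paca->some_field = value;
}

int model_read_field(void)
{
    return model_local_paca->some_field;
}

/*
 * Switch to a freshly allocated, zero-initialised paca. Assumption:
 * nothing is copied over from the temporary paca. The allocation is
 * deliberately never freed, since it lives for the rest of the run.
 */
int model_switch_to_allocated_paca(void)
{
    struct model_paca *p = calloc(1, sizeof(*p));

    if (!p)
        return -1;
    model_local_paca = p;
    return 0;
}
```

    A value written through model_write_field() before the switch is no
    longer visible through model_read_field() afterwards, which is the
    hazard that code running before the allocation has to avoid.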
    
    The additional code that runs before the paca allocation is primarily
    mmu_early_init_devtree(), which is scanning the device tree and
    populating globals and cur_cpu_spec with MMU related flags. I do not see
    any additional code that writes to paca fields.
    
    [1]: https://lore.kernel.org/r/20211018084434.217772-2-sourabhjain@linux.ibm.com
    [2]: https://lore.kernel.org/r/20211018084434.217772-3-sourabhjain@linux.ibm.com

    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/20220124130544.408675-1-mpe@ellerman.id.au