- 26 Mar, 2020 5 commits
-
-
Cédric Le Goater authored
As does XMON, the debugfs file /sys/kernel/debug/powerpc/xive exposes the XIVE internal state of the machine CPUs and interrupts. Available on the PowerNV and sPAPR platforms. Signed-off-by: Cédric Le Goater <clg@kaod.org> [mpe: Make the debugfs file 0400] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200306150143.5551-5-clg@kaod.org
-
Cédric Le Goater authored
Some firmwares or hypervisors can advertise different source characteristics. Track their value under XMON. What we are mostly interested in is the StoreEOI flag. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200306150143.5551-4-clg@kaod.org
-
Cédric Le Goater authored
The PowerNV platform has multiple IRQ chips and the xmon command dumping the state of the XIVE interrupt should only operate on the XIVE IRQ chip. Fixes: 5896163f ("powerpc/xmon: Improve output of XIVE interrupts") Cc: stable@vger.kernel.org # v5.4+ Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200306150143.5551-3-clg@kaod.org
-
Cédric Le Goater authored
When a CPU is brought up, an IPI number is allocated and recorded under the XIVE CPU structure. Invalid IPI numbers are tracked with interrupt number 0x0. On the PowerNV platform, the interrupt number space starts at 0x10 and this works fine. However, on the sPAPR platform, it is possible to allocate the interrupt number 0x0 and this raises an issue when CPU 0 is unplugged. The XIVE spapr driver tracks allocated interrupt numbers in a bitmask and it is not correctly updated when interrupt number 0x0 is freed. It stays allocated and it is then impossible to reallocate. Fix by using the XIVE_BAD_IRQ value instead of zero on both platforms. Reported-by: David Gibson <david@gibson.dropbear.id.au> Fixes: eac1e731 ("powerpc/xive: guest exploitation of the XIVE interrupt controller") Cc: stable@vger.kernel.org # v4.14+ Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Tested-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200306150143.5551-2-clg@kaod.org
-
Nick Desaulniers authored
Reported-by: Sedat Dilek <sedat.dilek@gmail.com> Suggested-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> [mpe: Drop changes to a/p/boot which doesn't use linux headers] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190812215052.71840-10-ndesaulniers@google.com
-
- 25 Mar, 2020 27 commits
-
-
Fabiano Rosas authored
The if statement that this comment referred to was removed in commit 11fdb309 ("powerpc/prom_init: Remove support for OPAL v2"). Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200324182912.1048906-1-farosas@linux.ibm.com
-
Christophe Leroy authored
When a program check exception happens while MMU translation is disabled, following Oops happens in kprobe_handler() in the following code: } else if (*addr != BREAKPOINT_INSTRUCTION) { BUG: Unable to handle kernel data access on read at 0x0000e268 Faulting instruction address: 0xc000ec34 Oops: Kernel access of bad area, sig: 11 [#1] BE PAGE_SIZE=16K PREEMPT CMPC885 Modules linked in: CPU: 0 PID: 429 Comm: cat Not tainted 5.6.0-rc1-s3k-dev-00824-g84195dc6c58a #3267 NIP: c000ec34 LR: c000ecd8 CTR: c019cab8 REGS: ca4d3b58 TRAP: 0300 Not tainted (5.6.0-rc1-s3k-dev-00824-g84195dc6c58a) MSR: 00001032 <ME,IR,DR,RI> CR: 2a4d3c52 XER: 00000000 DAR: 0000e268 DSISR: c0000000 GPR00: c000b09c ca4d3c10 c66d0620 00000000 ca4d3c60 00000000 00009032 00000000 GPR08: 00020000 00000000 c087de44 c000afe0 c66d0ad0 100d3dd6 fffffff3 00000000 GPR16: 00000000 00000041 00000000 ca4d3d70 00000000 00000000 0000416d 00000000 GPR24: 00000004 c53b6128 00000000 0000e268 00000000 c07c0000 c07bb6fc ca4d3c60 NIP [c000ec34] kprobe_handler+0x128/0x290 LR [c000ecd8] kprobe_handler+0x1cc/0x290 Call Trace: [ca4d3c30] [c000b09c] program_check_exception+0xbc/0x6fc [ca4d3c50] [c000e43c] ret_from_except_full+0x0/0x4 --- interrupt: 700 at 0xe268 Instruction dump: 913e0008 81220000 38600001 3929ffff 91220000 80010024 bb410008 7c0803a6 38210020 4e800020 38600000 4e800020 <813b0000> 6d2a7fe0 2f8a0008 419e0154 ---[ end trace 5b9152d4cdadd06d ]--- kprobe is not prepared to handle events in real mode and functions running in real mode should have been blacklisted, so kprobe_handler() can safely bail out telling 'this trap is not mine' for any trap that happened while in real-mode. If the trap happened with MSR_IR or MSR_DR cleared, return 0 immediately. Reported-by: Larry Finger <Larry.Finger@lwfinger.net> Fixes: 6cc89bad ("powerpc/kprobes: Invoke handlers directly") Cc: stable@vger.kernel.org # v4.10+ Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org> Reviewed-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/424331e2006e7291a1bfe40e7f3fa58825f565e1.1582054578.git.christophe.leroy@c-s.fr
-
Nathan Chancellor authored
When building ppc64 defconfig, Clang errors (trimmed for brevity): arch/powerpc/platforms/maple/setup.c:365:1: error: attribute declaration must precede definition [-Werror,-Wignored-attributes] machine_device_initcall(maple, maple_cpc925_edac_setup); ^ machine_device_initcall expands to __define_machine_initcall, which in turn has the macro machine_is used in it, which declares mach_##name with an __attribute__((weak)). define_machine actually defines mach_##name, which in this file happens before the declaration, hence the warning. To fix this, move define_machine after machine_device_initcall so that the declaration occurs before the definition, which matches how machine_device_initcall and define_machine work throughout arch/powerpc. While we're here, remove some spaces before tabs. Fixes: 8f101a05 ("edac: cpc925 MC platform device setup") Reported-by: Nick Desaulniers <ndesaulniers@google.com> Suggested-by: Ilie Halip <ilie.halip@gmail.com> Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200323222729.15365-1-natechancellor@gmail.com
-
Nicholas Piggin authored
Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200320152436.1468651-1-npiggin@gmail.com
-
Oliver O'Halloran authored
With the EEH early probe now being pseries specific there's no need for eeh_ops->probe() to take a pci_dn. Instead, we can make it take a pci_dev and use the probe function to map a pci_dev to an eeh_dev. This allows the platform to implement it's own method for finding (or creating) an eeh_dev for a given pci_dev which also removes a use of pci_dn in generic EEH code. This patch also renames eeh_device_add_late() to eeh_device_probe(). This better reflects what it does does and removes the last vestiges of the early/late EEH probe split. Reviewed-by: Sam Bobroff <sbobroff@linux.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200306073904.4737-6-oohall@gmail.com
-
Oliver O'Halloran authored
The eeh_ops->probe() function is called from two different contexts: 1. On pseries, where we set EEH_PROBE_MODE_DEVTREE, it's called in eeh_add_device_early() which is supposed to run before we create a pci_dev. 2. On PowerNV, where we set EEH_PROBE_MODE_DEV, it's called in eeh_device_add_late() which is supposed to run *after* the pci_dev is created. The "early" probe is required because PAPR requires that we perform an RTAS call to enable EEH support on a device before we start interacting with it via config space or MMIO. This requirement doesn't exist on PowerNV and shoehorning two completely separate initialisation paths into a common interface just results in a convoluted code everywhere. Additionally the early probe requires the probe function to take an pci_dn rather than a pci_dev argument. We'd like to make pci_dn a pseries specific data structure since there's no real requirement for them on PowerNV. To help both goals move the early probe into the pseries containment zone so the platform depedence is more explicit. Reviewed-by: Sam Bobroff <sbobroff@linux.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200306073904.4737-5-oohall@gmail.com
-
Oliver O'Halloran authored
This check for a missing PHB has existing in various forms since the initial PPC64 port was upstreamed in 2002. The idea seems to be that we need to guard against creating pci-specific data structures for the non-pci children of a PCI device tree node (e.g. USB devices). However, we only create pci_dn structures for DT nodes that correspond to PCI devices so there's not much point in doing this check in the eeh_probe path. Reviewed-by: Sam Bobroff <sbobroff@linux.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200306073904.4737-4-oohall@gmail.com
-
Oliver O'Halloran authored
The pci hotplug helper (pci_hp_add_devices()) calls eeh_add_device_tree_early() to scan the device-tree for new PCI devices and do the early EEH probe before the device is scanned. This early probe is a no-op in a lot of cases because: a) The early init is only required to satisfy a PAPR requirement that EEH be configured before we start doing config accesses. On PowerNV it is a no-op. b) It's a no-op for devices that have already had their eeh_dev initialised. There are four callers of pci_hp_add_devices(): 1. arch/powerpc/kernel/eeh_driver.c Here the hotplug helper is called when re-scanning pci_devs that were removed during an EEH recovery pass. The EEH stat for each removed device (the eeh_dev) is retained across a recovery pass so the early init is a no-op in this case. 2. drivers/pci/hotplug/pnv_php.c This is also a no-op since the PowerNV hotplug driver is, suprisingly, PowerNV specific. 3. drivers/pci/hotplug/rpaphp_core.c 4. drivers/pci/hotplug/rpaphp_pci.c In these two cases new devices have been hotplugged and FW has provided new DT nodes for each. These are the only two cases where the EEH we might have new PCI device nodes in the DT so these are the only two cases where the early EEH probe needs to be done. We can move the calls to eeh_add_device_tree_early() to the locations where it's needed and remove it from the generic path. This is preparation for making the early EEH probe pseries specific. Reviewed-by: Sam Bobroff <sbobroff@linux.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200306073904.4737-3-oohall@gmail.com
-
Oliver O'Halloran authored
On pseries and PowerNV pcibios_bus_add_device() calls eeh_add_device_late() so there's no need to do a separate tree traversal to bind the eeh_dev and pci_dev together setting up the PHB at boot. As a result we can remove eeh_add_device_tree_late(). Reviewed-by: Sam Bobroff <sbobroff@linux.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200306073904.4737-2-oohall@gmail.com
-
Oliver O'Halloran authored
Move creating the EEH specific sysfs files into eeh_add_device_late() rather than being open-coded all over the place. Calling the function is generally done immediately after calling eeh_add_device_late() anyway. This is also a correctness fix since currently the sysfs files will be added even if the EEH probe happens to fail. Similarly, on pseries we currently add the sysfs files before calling eeh_add_device_late(). This is flat-out broken since the sysfs files require the pci_dev->dev.archdata.edev pointer to be set, and that is done in eeh_add_device_late(). Reviewed-by: Sam Bobroff <sbobroff@linux.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200306073904.4737-1-oohall@gmail.com
-
Michael Ellerman authored
The previous commit reduced the amount of code that is run before we setup a paca. However there are still a few remaining functions that run with no paca, or worse, with an arbitrary value in r13 that will be used as a paca pointer. In particular the stack protector canary is stored in the paca, so if stack protector is activated for any of these functions we will read the stack canary from wherever r13 points. If r13 happens to point outside of memory we will get a machine check / checkstop. For example if we modify initialise_paca() to trigger stack protection, and then boot in the mambo simulator with r13 poisoned in skiboot before calling the kernel: DEBUG: 19952232: (19952232): INSTRUCTION: PC=0xC0000000191FC1E8: [0x3C4C006D]: addis r2,r12,0x6D [fetch] DEBUG: 19952236: (19952236): INSTRUCTION: PC=0xC00000001807EAD8: [0x7D8802A6]: mflr r12 [fetch] FATAL ERROR: 19952276: (19952276): Check Stop for 0:0: Machine Check with ME bit of MSR off DEBUG: 19952276: (19952276): INSTRUCTION: PC=0xC0000000191FCA7C: [0xE90D0CF8]: ld r8,0xCF8(r13) [Instruction Failed] INFO: 19952276: (19952277): ** Execution stopped: Mambo Error, Machine Check Stop, ** systemsim % bt pc: 0xC0000000191FCA7C initialise_paca+0x54 lr: 0xC0000000191FC22C early_setup+0x44 stack:0x00000000198CBED0 0x0 +0x0 stack:0x00000000198CBF00 0xC0000000191FC22C early_setup+0x44 stack:0x00000000198CBF90 0x1801C968 +0x1801C968 So annotate the relevant functions to ensure stack protection is never enabled for them. Fixes: 06ec27ae ("powerpc/64: add stack protector support") Cc: stable@vger.kernel.org # v4.20+ Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200320032116.1024773-2-mpe@ellerman.id.au
-
Daniel Axtens authored
Currently we set up the paca after parsing the device tree for CPU features. Prior to that, r13 contains random data, which means there is random data in r13 while we're running the generic dt parsing code. This random data varies depending on whether we boot through a vmlinux or a zImage: for the vmlinux case it's usually around zero, but for zImages we see random values like 912a72603d420015. This is poor practice, and can also lead to difficult-to-debug crashes. For example, when kcov is enabled, the kcov instrumentation attempts to read preempt_count out of the current task, which goes via the paca. This then crashes in the zImage case. Similarly stack protector can cause crashes if r13 is bogus, by reading from the stack canary in the paca. To resolve this: - move the paca setup to before the CPU feature parsing. - because we no longer have access to CPU feature flags in paca setup, change the HV feature test in the paca setup path to consider the actual value of the MSR rather than the CPU feature. Translations get switched on once we leave early_setup, so I think we'd already catch any other cases where the paca or task aren't set up. Boot tested on a P9 guest and host. Fixes: fb0b0a73 ("powerpc: Enable kcov") Fixes: 06ec27ae ("powerpc/64: add stack protector support") Cc: stable@vger.kernel.org # v4.20+ Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com> Suggested-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Daniel Axtens <dja@axtens.net> [mpe: Reword comments & change log a bit to mention stack protector] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200320032116.1024773-1-mpe@ellerman.id.au
-
Pratik Rajesh Sampat authored
The patch avoids allocating cpufreq_policy on stack hence fixing frame size overflow in 'powernv_cpufreq_work_fn' Fixes: 22794280 ("cpufreq: powernv: Restore cpu frequency to policy->cur on unthrottling") Signed-off-by: Pratik Rajesh Sampat <psampat@linux.ibm.com> Reviewed-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200316135743.57735-1-psampat@linux.ibm.com
-
Po-Hsu Lin authored
Some specific tests in powerpc can take longer than the default 45 seconds that added in commit 852c8cbf ("selftests/kselftest/runner.sh: Add 45 second timeout per test") to run, the following test result was collected across 2 Power8 nodes and 1 Power9 node in our pool: powerpc/benchmarks/futex_bench - 52s powerpc/dscr/dscr_sysfs_test - 116s powerpc/signal/signal_fuzzer - 88s powerpc/tm/tm_unavailable_test - 168s powerpc/tm/tm-poison - 240s Thus they will fail with TIMEOUT error. Disable the timeout setting for these sub-tests to allow them finish properly. https://bugs.launchpad.net/bugs/1864642 Fixes: 852c8cbf ("selftests/kselftest/runner.sh: Add 45 second timeout per test") Signed-off-by: Po-Hsu Lin <po-hsu.lin@canonical.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200318060004.10685-1-po-hsu.lin@canonical.com
-
Aneesh Kumar K.V authored
H_PAGE_THP_HUGE is used to differentiate between a THP hugepage and hugetlb hugepage entries. The difference is WRT how we handle hash fault on these address. THP address enables MPSS in segments. We want to manage devmap hugepage entries similar to THP pt entries. Hence use H_PAGE_THP_HUGE for devmap huge PTE entries. With current code while handling hash PTE fault, we do set is_thp = true when finding devmap PTE huge PTE entries. Current code also does the below sequence we setting up huge devmap entries. entry = pmd_mkhuge(pfn_t_pmd(pfn, prot)); if (pfn_t_devmap(pfn)) entry = pmd_mkdevmap(entry); In that case we would find both H_PAGE_THP_HUGE and PAGE_DEVMAP set for huge devmap PTE entries. This results in false positive error like below. kernel BUG at /home/kvaneesh/src/linux/mm/memory.c:4321! Oops: Exception in kernel mode, sig: 5 [#1] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries Modules linked in: CPU: 56 PID: 67996 Comm: t_mmap_dio Not tainted 5.6.0-rc4-59640-g371c804dedbc #128 .... NIP [c00000000044c9e4] __follow_pte_pmd+0x264/0x900 LR [c0000000005d45f8] dax_writeback_one+0x1a8/0x740 Call Trace: str_spec.74809+0x22ffb4/0x2d116c (unreliable) dax_writeback_one+0x1a8/0x740 dax_writeback_mapping_range+0x26c/0x700 ext4_dax_writepages+0x150/0x5a0 do_writepages+0x68/0x180 __filemap_fdatawrite_range+0x138/0x180 file_write_and_wait_range+0xa4/0x110 ext4_sync_file+0x370/0x6e0 vfs_fsync_range+0x70/0xf0 sys_msync+0x220/0x2e0 system_call+0x5c/0x68 This is because our pmd_trans_huge check doesn't exclude _PAGE_DEVMAP. To make this all consistent, update pmd_mkdevmap to set H_PAGE_THP_HUGE and pmd_trans_huge check now excludes _PAGE_DEVMAP correctly. Fixes: ebd31197 ("powerpc/mm: Add devmap support for ppc64") Cc: stable@vger.kernel.org # v4.13+ Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200313094842.351830-1-aneesh.kumar@linux.ibm.com
-
Chen Zhou authored
Fixes gcc '-Wunused-but-set-variable' warning: drivers/pci/hotplug/rpaphp_core.c: In function is_php_type: drivers/pci/hotplug/rpaphp_core.c:291:16: warning: variable value set but not used [-Wunused-but-set-variable] Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Chen Zhou <chenzhou10@huawei.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200312140412.32373-1-chenzhou10@huawei.com
-
Christophe Leroy authored
Reorder Linux PTE bits to (almost) match Hash PTE bits. RW Kernel : PP = 00 RO Kernel : PP = 00 RW User : PP = 01 RO User : PP = 11 So naturally, we should have _PAGE_USER = 0x001 _PAGE_RW = 0x002 Today 0x001 and 0x002 and _PAGE_PRESENT and _PAGE_HASHPTE which both are software only bits. Switch _PAGE_USER and _PAGE_PRESET Switch _PAGE_RW and _PAGE_HASHPTE This allows to remove a few insns. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/c4d6c18a7f8d9d3b899bc492f55fbc40ef38896a.1583861325.git.christophe.leroy@c-s.fr
-
Christophe Leroy authored
At the moment kasan_remap_early_shadow_ro() does nothing, because k_end is 0 and k_cur < 0 is always true. Change the test to k_cur != k_end, as done in kasan_init_shadow_page_tables() Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Fixes: cbd18991 ("powerpc/mm: Fix an Oops in kasan_mmu_init()") Cc: stable@vger.kernel.org Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/4e7b56865e01569058914c991143f5961b5d4719.1583507333.git.christophe.leroy@c-s.fr
-
Christophe Leroy authored
At the time being we have something like if (something) { p = get(); if (p) { if (something_wrong) goto out; ... return; } else if (a != b) { if (some_error) goto out; ... } goto out; } p = get(); if (!p) { if (a != b) { if (some_error) goto out; ... } goto out; } This is similar to p = get(); if (!p) { if (a != b) { if (some_error) goto out; ... } goto out; } if (something) { if (something_wrong) goto out; ... return; } Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> [mpe: Reflow the comment that was moved] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/07a17425743600460ce35fa9432d42487a825583.1582099499.git.christophe.leroy@c-s.fr
-
Michael Ellerman authored
We currently have two section mismatch warnings: The function __boot_from_prom() references the function __init prom_init(). The function start_here_common() references the function __init start_kernel(). The warnings are correct, we do have branches from non-init code into init code, which is freed after boot. But we don't expect to ever execute any of that early boot code after boot, if we did that would be a bug. In particular calling into OF after boot would be fatal because OF is no longer resident. So for now fix the warnings by marking the relevant functions as __REF, which puts them in the ".ref.text" section. This causes some reordering of the functions in the final link: @@ -217,10 +217,9 @@ c00000000000b088 t generic_secondary_common_init c00000000000b124 t __mmu_off c00000000000b14c t __start_initialization_multiplatform -c00000000000b1ac t __boot_from_prom -c00000000000b1ec t __after_prom_start -c00000000000b260 t p_end -c00000000000b27c T copy_and_flush +c00000000000b1ac t __after_prom_start +c00000000000b220 t p_end +c00000000000b23c T copy_and_flush c00000000000b300 T __secondary_start c00000000000b300 t copy_to_here c00000000000b344 t start_secondary_prolog @@ -228,8 +227,9 @@ c00000000000b36c t enable_64b_mode c00000000000b388 T relative_toc c00000000000b3a8 t p_toc -c00000000000b3b0 t start_here_common -c00000000000b3d0 t start_here_multiplatform +c00000000000b3b0 t __boot_from_prom +c00000000000b3f0 t start_here_multiplatform +c00000000000b480 t start_here_common c00000000000b880 T system_call_common c00000000000b974 t system_call c00000000000b9dc t system_call_exit In particular __boot_from_prom moves after copy_to_here, which means it's not copied to zero in the first stage of copy of the kernel to zero. But that's OK, because we only call __boot_from_prom before we do the copy, so it makes no difference when it's copied. The call sequence is: __start -> __start_initialization_multiplatform -> __boot_from_prom -> __start -> __start_initialization_multiplatform -> __after_prom_start -> copy_and_flush -> copy_and_flush (relocated to 0) -> start_here_multiplatform -> early_setup Reported-by: Mauricio Faria de Oliveira <mauricfo@linux.ibm.com> Reported-by: Roman Bolshakov <r.bolshakov@yadro.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200225031328.14676-1-mpe@ellerman.id.au
-
Michael Ellerman authored
In xmon we have two variables that are used by the dump commands. There's ndump which is the number of bytes to dump using 'd', and nidump which is the number of instructions to dump using 'di'. ndump starts as 64 and nidump starts as 16, but both can be set by the user. It's fairly common to be pasting addresses into xmon when trying to debug something, and if you inadvertently double paste an address like so: 0:mon> di c000000002101f6c c000000002101f6c The second value is interpreted as the number of instructions to dump. Luckily it doesn't dump 13 quintrillion instructions, the value is limited to MAX_DUMP (128K). But as each instruction is dumped on a single line, that's still a lot of output. If you're on a slow console that can take multiple minutes to print. If you were "just popping in and out of xmon quickly before the RCU/hardlockup detector fires" you are now having a bad day. Things are not as bad with 'd' because we print 16 bytes per line, so it's fewer lines. But it's still quite a lot. So shrink the maximum for 'd' to 64K (one page), which is 4096 lines. For 'di' add a new limit which is the above / 4 - because instructions are 4 bytes, meaning again we can dump one page. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200219110007.31195-1-mpe@ellerman.id.au
-
Alexey Kardashevskiy authored
The "os-term" RTAS calls has one argument with a message address of OS termination cause. rtas_os_term() already passes it but the recently added prom_init's version of that missed it; it also does not fill args correctly. This passes the message address and initializes the number of arguments. Fixes: 6a9c930b ("powerpc/prom_init: Add the ESM call to prom_init") Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200312074404.87293-1-aik@ozlabs.ru
-
afzal mohammed authored
request_irq() is preferred over setup_irq(). Invocations of setup_irq() occur after memory allocators are ready. Per tglx[1], setup_irq() existed in olden days when allocators were not ready by the time early interrupts were initialized. Hence replace setup_irq() by request_irq(). [1] https://lkml.kernel.org/r/alpine.DEB.2.20.1710191609480.1971@nanosSigned-off-by: afzal mohammed <afzal.mohd.ma@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200312064256.18735-1-afzal.mohd.ma@gmail.com
-
Joe Perches authored
Convert the various uses of fallthrough comments to fallthrough; Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/03073a9a269010ca439e9e658629c44602b0cc9f.1583896348.git.joe@perches.com
-
Balamuruhan S authored
ld instruction should have 14 bit immediate field (DS) concatenated with 0b00 on the right, encode it accordingly. Introduce macro `IMM_DS()` to encode DS form instructions with 14 bit immediate field. Fixes: 4ceae137 ("powerpc: emulate_step() tests for load/store instructions") Reviewed-by: Sandipan Das <sandipan@linux.ibm.com> Signed-off-by: Balamuruhan S <bala24@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200311102405.392263-1-bala24@linux.ibm.com
-
Tyrel Datwyler authored
The expectation is that when calling of_read_drc_info_cell() repeatedly to parse multiple drc-info records that the in/out curval parameter points at the start of the next record on return. However, the current behavior has curval still pointing at the final value of the record just parsed. The result of which is that if the ibm,drc-info property contains multiple properties the parsed value of the drc_type for any record after the first has the power_domain value of the previous record appended to the type string. eg: observed the following 0xffffffff prepended to PHB drc-info: type: \xff\xff\xff\xffPHB, prefix: PHB , index_start: 0x20000001 drc-info: suffix_start: 1, sequential_elems: 3072, sequential_inc: 1 drc-info: power-domain: 0xffffffff, last_index: 0x20000c00 In practice PHBs are the only type of connector in the ibm,drc-info property that has multiple records. So, it breaks PHB hotplug, but by chance not PCI, CPU, slot, or memory because they happen to only ever be a single record. Fix by incrementing curval past the power_domain value to point at drc_type string of next record. Fixes: e83636ac ("pseries/drc-info: Search DRC properties for CPU indexes") Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com> Acked-by: Nathan Lynch <nathanl@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200307024547.5748-1-tyreld@linux.ibm.com
-
Gustavo Luiz Duarte authored
The test case tm-signal-context-force-tm expects a segfault to happen on returning from signal handler, and then does a setcontext() to run the test again. However, the test doesn't always segfault, causing the test to run a single time. This patch fixes the test by putting it within a loop and jumping, via setcontext, just prior to the loop in case it segfaults. This way we get the desired behavior (run the test COUNT_MAX times) regardless if it segfaults or not. This also reduces the use of setcontext for control flow logic, keeping it only in the segfault handler. Also, since 'count' is changed within the signal handler, it is declared as volatile to prevent any compiler optimization getting confused with asynchronous changes. Signed-off-by: Gustavo Luiz Duarte <gustavold@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200211033831.11165-3-gustavold@linux.ibm.com
-
- 20 Mar, 2020 3 commits
-
-
Gustavo Luiz Duarte authored
This test triggers a TM Bad Thing by raising a signal in transactional state and forcing a pagefault to happen in kernelspace when the kernel signal handling code first touches the user signal stack. This is inspired by the test tm-signal-context-force-tm but uses userfaultfd to make the test deterministic. While this test always triggers the bug in one run, I had to execute tm-signal-context-force-tm several times (the test runs 5000 times each execution) to trigger the same bug. tm-signal-context-force-tm is kept instead of replaced because, while this test is more reliable and triggers the same bug, tm-signal-context-force-tm has a better coverage, in the sense that by running the test several times it might trigger the pagefault and/or be preempted at different places. v3: skip test if userfaultfd is unavailable. Signed-off-by: Gustavo Luiz Duarte <gustavold@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200211033831.11165-2-gustavold@linux.ibm.com
-
Michael Ellerman authored
Currently you can enable PPC_KUAP_DEBUG when PPC_KUAP is disabled, even though the former has not effect without the latter. Fix it so that PPC_KUAP_DEBUG can only be enabled when PPC_KUAP is enabled, not when the platform could support KUAP (PPC_HAVE_KUAP). Fixes: 890274c2 ("powerpc/64s: Implement KUAP for Radix MMU") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200301111738.22497-1-mpe@ellerman.id.au
-
Michael Ellerman authored
There's two different paths through the sigreturn code, depending on whether the VDSO is mapped or not. We recently discovered a bug in the unmapped case, because it's not commonly used these days. So add a test that sends itself a signal, then moves the VDSO, takes another signal and finally unmaps the VDSO before sending itself another signal. That tests the standard signal path, the code that handles the VDSO being moved, and also the signal path in the case where the VDSO is unmapped. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200304110402.6038-1-mpe@ellerman.id.au
-
- 17 Mar, 2020 5 commits
-
-
Nicholas Piggin authored
We should be checking that the instruction was stepped *and* that the target register has the right value. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> [mpe: Write change log] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200226055302.1577954-1-npiggin@gmail.com
-
Nicholas Piggin authored
Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200302010410.2957362-1-npiggin@gmail.com
-
Christophe Leroy authored
The commit identified below added tlbie_test but forgot to add it in .gitignore. Fixes: 93cad5f7 ("selftests/powerpc: Add test case for tlbie vs mtpidr ordering issue") Cc: stable@vger.kernel.org # v5.4+ Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/259f9c06ed4563c4fa4fa8ffa652347278d769e7.1582847784.git.christophe.leroy@c-s.fr
-
YueHaibing authored
core99_l2_cache/core99_l3_cache do not need to be marked as volatile, remove it. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200303085604.24952-1-yuehaibing@huawei.com
-
Ilie Halip authored
When building with ppc64_defconfig, the compiler reports that these 2 variables are not used: warning: unused variable 'core99_l2_cache' [-Wunused-variable] warning: unused variable 'core99_l3_cache' [-Wunused-variable] They are only used when CONFIG_PPC64 is not defined. Move them into a section which does the same macro check. Reported-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Ilie Halip <ilie.halip@gmail.com> [mpe: Move them into core99_init_caches() which is their only user] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190920153951.25762-1-ilie.halip@gmail.com
-