- 13 Aug, 2013 7 commits
-
-
Stephen Warren authored
Tegra20 HW appears to have a bug such that PCIe device interrupts, whether they are legacy IRQs or MSI, are lost when LP2 is enabled. To work around this, simply disable LP2 if any PCIe devices with interrupts are present. Detect this via the IRQ domain map operation. This is slightly over-conservative; if a device with an interrupt is present but the driver does not actually use them, LP2 will still be disabled. However, this is a reasonable trade-off which enables a simpler workaround. Signed-off-by: Stephen Warren <swarren@nvidia.com> Tested-by: Thierry Reding <treding@nvidia.com> Acked-by: Thierry Reding <treding@nvidia.com>
-
Thierry Reding authored
I'll be taking on maintainership of the Tegra PCIe driver since it's now moved out of arch/arm/mach-tegra. Signed-off-by: Thierry Reding <treding@nvidia.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Stephen Warren <swarren@nvidia.com>
-
Stephen Warren authored
The registers PADS_REFCLK_CFG are an array of 16-bit data, one entry per PCIe root port. For Tegra30, we therefore need to write a 3rd entry in this array. Doing so makes the mini-PCIe slot on Beaver operate correctly. While we're at it, add some #defines to partially document the fields within these 16-bit values. Signed-off-by: Stephen Warren <swarren@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Stephen Warren <swarren@nvidia.com>
-
Jay Agarwal authored
Introduce a data structure to parameterize the driver according to SoC generation, add Tegra30 specific code and update the device tree binding document for Tegra30 support. Signed-off-by: Jay Agarwal <jagarwal@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Stephen Warren <swarren@nvidia.com>
-
Thierry Reding authored
Move the PCIe driver from arch/arm/mach-tegra into the drivers/pci/host directory. The motivation is to collect various host controller drivers in the same location in order to facilitate refactoring. The Tegra PCIe driver has been largely rewritten, both in order to turn it into a proper platform driver and to add MSI (based on code by Krishna Kishore <kthota@nvidia.com>) as well as device tree support. Signed-off-by: Thierry Reding <thierry.reding@avionic-design.de> Signed-off-by: Thierry Reding <treding@nvidia.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> [swarren, split DT changes into a separate patch in another branch] Signed-off-by: Stephen Warren <swarren@nvidia.com>
-
Stephen Warren authored
pci msi changes for v3.12 (round 2) - fix build breakage for s390 allyesconfig due to !HAVE_GENERIC_HARDIRQS
-
Thomas Petazzoni authored
Some platforms (e.g S390) don't use the generic hardirqs code and therefore do not defined HAVE_GENERIC_HARDIRQS. This prevents using the irq_set_chip_data() and irq_get_chip_data() functions that are used for the default implementations of the MSI operations. So, when CONFIG_GENERIC_HARDIRQS is not enabled, provide another default implementation of the MSI operations, that simply errors out. The architecture is responsible for implementing those operations (which is the case on S390), and cannot use the msi_chip infrastructure. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: Jason Cooper <jason@lakedaemon.net>
-
- 12 Aug, 2013 12 commits
-
-
Joseph Lo authored
The LP1 suspend mode will power off the CPU, clock gated the PLLs and put SDRAM to self-refresh mode. Any interrupt can wake up device from LP1. The sequence when LP1 suspending: * tunning off L1 data cache and the MMU * storing some EMC registers, DPD (deep power down) status, clk source of mselect and SCLK burst policy * putting SDRAM into self-refresh * switching CPU to CLK_M (12MHz OSC) * tunning off PLLM, PLLP, PLLA, PLLC and PLLX * switching SCLK to CLK_S (32KHz OSC) * shutting off the CPU rail The sequence of LP1 resuming: * re-enabling PLLM, PLLP, PLLA, PLLC and PLLX * restoring the clk source of mselect and SCLK burst policy * setting up CCLK burst policy to PLLX * restoring DPD status and some EMC registers * resuming SDRAM to normal mode * jumping to the "tegra_resume" from PMC_SCRATCH41 Due to the SDRAM will be put into self-refresh mode, the low level procedures of LP1 suspending and resuming should be copied to TEGRA_IRAM_CODE_AREA (TEGRA_IRAM_BASE + SZ_4K) when suspending. Before restoring the CPU context when resuming, the SDRAM needs to be switched back to normal mode. And the PLLs need to be re-enabled, SCLK burst policy be restored. Then jumping to "tegra_resume" that was expected to be stored in PMC_SCRATCH41 to restore CPU context and back to kernel. Based on the work by: Bo Yan <byan@nvidia.com> Signed-off-by: Joseph Lo <josephl@nvidia.com> Signed-off-by: Stephen Warren <swarren@nvidia.com>
-
Joseph Lo authored
The LP1 suspend mode will power off the CPU, clock gated the PLLs and put SDRAM to self-refresh mode. Any interrupt can wake up device from LP1. The sequence when LP1 suspending: * tunning off L1 data cache and the MMU * putting SDRAM into self-refresh * storing some EMC registers and SCLK burst policy * switching CPU to CLK_M (12MHz OSC) * switching SCLK to CLK_S (32KHz OSC) * tunning off PLLM, PLLP and PLLC * shutting off the CPU rail The sequence of LP1 resuming: * re-enabling PLLM, PLLP, and PLLC * restoring some EMC registers and SCLK burst policy * setting up CCLK burst policy to PLLP * resuming SDRAM to normal mode * jumping to the "tegra_resume" from PMC_SCRATCH41 Due to the SDRAM will be put into self-refresh mode, the low level procedures of LP1 suspending and resuming should be copied to TEGRA_IRAM_CODE_AREA (TEGRA_IRAM_BASE + SZ_4K) when suspending. Before restoring the CPU context when resuming, the SDRAM needs to be switched back to normal mode. And the PLLs need to be re-enabled, SCLK burst policy be restored, CCLK burst policy be set in PLLP. Then jumping to "tegra_resume" that was expected to be stored in PMC_SCRATCH41 to restore CPU context and back to kernel. Based on the work by: Colin Cross <ccross@android.com> Gary King <gking@nvidia.com> Signed-off-by: Joseph Lo <josephl@nvidia.com> Signed-off-by: Stephen Warren <swarren@nvidia.com>
-
Joseph Lo authored
The LP1 suspend mode will power off the CPU, clock gated the PLLs and put SDRAM to self-refresh mode. Any interrupt can wake up device from LP1. The sequence when LP1 suspending: * tunning off L1 data cache and the MMU * storing some EMC registers, DPD (deep power down) status, clk source of mselect and SCLK burst policy * putting SDRAM into self-refresh * switching CPU to CLK_M (12MHz OSC) * tunning off PLLM, PLLP, PLLA, PLLC and PLLX * switching SCLK to CLK_S (32KHz OSC) * shutting off the CPU rail The sequence of LP1 resuming: * re-enabling PLLM, PLLP, PLLA, PLLC and PLLX * restoring the clk source of mselect and SCLK burst policy * setting up CCLK burst policy to PLLX * restoring DPD status and some EMC registers * resuming SDRAM to normal mode * jumping to the "tegra_resume" from PMC_SCRATCH41 Due to the SDRAM will be put into self-refresh mode, the low level procedures of LP1 suspending and resuming should be copied to TEGRA_IRAM_CODE_AREA (TEGRA_IRAM_BASE + SZ_4K) when suspending. Before restoring the CPU context when resuming, the SDRAM needs to be switched back to normal mode. And the PLLs need to be re-enabled, SCLK burst policy be restored, CCLK burst policy be set in PLLX. Then jumping to "tegra_resume" that was expected to be stored in PMC_SCRATCH41 to restore CPU context and back to kernel. Based on the work by: Scott Williams <scwilliams@nvidia.com> Signed-off-by: Joseph Lo <josephl@nvidia.com> Signed-off-by: Stephen Warren <swarren@nvidia.com>
-
Joseph Lo authored
The LP1 suspending mode on Tegra means CPU rail off, devices and PLLs are clock gated and SDRAM in self-refresh mode. That means the low level LP1 suspending and resuming code couldn't be run on DRAM and the CPU must switch to the always on clock domain (a.k.a. CLK_M 12MHz oscillator). And the system clock (SCLK) would be switched to CLK_S, a 32KHz oscillator. The LP1 low level handling code need to be moved to IRAM area first. And marking the LP1 mask for indicating the Tegra device is in LP1. The CPU power timer needs to be re-calculated based on 32KHz that was originally based on PCLK. When resuming from LP1, the LP1 reset handler will resume PLLs and then put DRAM to normal mode. Then jumping to the "tegra_resume" that will restore full context before back to kernel. The "tegra_resume" handler was expected to be found in PMC_SCRATCH41 register. This is common LP1 procedures for Tegra, so we do these jobs mainly in this patch: * moving LP1 low level handling code to IRAM * marking LP1 mask * copying the physical address of "tegra_resume" to PMC_SCRATCH41 * re-calculate the CPU power timer based on 32KHz Signed-off-by: Joseph Lo <josephl@nvidia.com> [swarren, replaced IRAM_CODE macro with IO_ADDRESS(TEGRA_IRAM_CODE_AREA)] Signed-off-by: Stephen Warren <swarren@nvidia.com>
-
Joseph Lo authored
When the system suspends to LP1, the CPU clock source is switched to CLK_M (12MHz Oscillator) during suspend/resume flow. The CPU clock source is controlled by the CCLKG_BURST_POLICY register, and hence this register must be restored during LP1 resume. Cc: Mike Turquette <mturquette@linaro.org> Signed-off-by: Joseph Lo <josephl@nvidia.com> Signed-off-by: Stephen Warren <swarren@nvidia.com>
-
Joseph Lo authored
When suspending to LP1 mode, the SYSCLK will be clock gated. And different board may have different polarity of the request of SYSCLK, this patch configure the polarity from the DT for the board. Signed-off-by: Joseph Lo <josephl@nvidia.com> Signed-off-by: Stephen Warren <swarren@nvidia.com>
-
Joseph Lo authored
Add support to the Tegra CPU reset vector to detect whether the CPU is resuming from LP1 suspend state. If it is, branch to the LP1-specific resume code. When Tegra enters the LP1 suspend state, the SDRAM controller is placed into a self-refresh state. For this reason, we must place the LP1 resume code into IRAM, so that it is accessible before SDRAM access has been re-enabled. Signed-off-by: Joseph Lo <josephl@nvidia.com> Signed-off-by: Stephen Warren <swarren@nvidia.com>
-
Thomas Petazzoni authored
Some PCI drivers may need to adjust the pci_bus structure after it has been allocated by the Linux PCI core. The PCI core allows architectures to implement the pcibios_add_bus() and pcibios_remove_bus() for this purpose. This commit therefore extends the hw_pci and pci_sys_data structures of the ARM PCI core to allow PCI drivers to register ->add_bus() and ->remove_bus() in hw_pci, which will get called when a bus is added or removed from the system. This will be used for example by the Marvell PCIe driver to connect a particular PCI bus with its corresponding MSI chip to handle Message Signaled Interrupts. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Reviewed-by: Thierry Reding <thierry.reding@gmail.com> Acked-by: Russell King <rmk+kernel@arm.linux.org.uk> Tested-by: Daniel Price <daniel.price@gmail.com> Tested-by: Thierry Reding <thierry.reding@gmail.com> Signed-off-by: Jason Cooper <jason@lakedaemon.net>
-
Thomas Petazzoni authored
This commit adds a very basic registry of msi_chip structures, so that an IRQ controller driver can register an msi_chip, and a PCIe host controller can find it, based on a 'struct device_node'. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Rob Herring <rob.herring@calxeda.com> Signed-off-by: Jason Cooper <jason@lakedaemon.net>
-
Thierry Reding authored
The new struct msi_chip is used to associated an MSI controller with a PCI bus. It is automatically handed down from the root to its children during bus enumeration. This patch provides default (weak) implementations for the architecture- specific MSI functions (arch_setup_msi_irq(), arch_teardown_msi_irq() and arch_msi_check_device()) which check if a PCI device's bus has an attached MSI chip and forward the call appropriately. Signed-off-by: Thierry Reding <thierry.reding@avionic-design.de> Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Daniel Price <daniel.price@gmail.com> Tested-by: Thierry Reding <thierry.reding@gmail.com> Signed-off-by: Jason Cooper <jason@lakedaemon.net>
-
Thomas Petazzoni authored
Now that we have weak versions for each of the PCI MSI architecture functions, we can actually build the MSI support for all platforms, regardless of whether they provide or not architecture-specific versions of those functions. For this reason, the ARCH_SUPPORTS_MSI hidden kconfig boolean becomes useless, and this patch gets rid of it. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Tested-by: Daniel Price <daniel.price@gmail.com> Tested-by: Thierry Reding <thierry.reding@gmail.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: linuxppc-dev@lists.ozlabs.org Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: linux390@de.ibm.com Cc: linux-s390@vger.kernel.org Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: x86@kernel.org Cc: Russell King <linux@arm.linux.org.uk> Cc: Tony Luck <tony.luck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: linux-ia64@vger.kernel.org Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: David S. Miller <davem@davemloft.net> Cc: sparclinux@vger.kernel.org Cc: Chris Metcalf <cmetcalf@tilera.com> Signed-off-by: Jason Cooper <jason@lakedaemon.net>
-
Thomas Petazzoni authored
Until now, the MSI architecture-specific functions could be overloaded using a fairly complex set of #define and compile-time conditionals. In order to prepare for the introduction of the msi_chip infrastructure, it is desirable to switch all those functions to use the 'weak' mechanism. This commit converts all the architectures that were overidding those MSI functions to use the new strategy. Note that we keep two separate, non-weak, functions default_teardown_msi_irqs() and default_restore_msi_irqs() for the default behavior of the arch_teardown_msi_irqs() and arch_restore_msi_irqs(), as the default behavior is needed by x86 PCI code. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Tested-by: Daniel Price <daniel.price@gmail.com> Tested-by: Thierry Reding <thierry.reding@gmail.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: linuxppc-dev@lists.ozlabs.org Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: linux390@de.ibm.com Cc: linux-s390@vger.kernel.org Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: x86@kernel.org Cc: Russell King <linux@arm.linux.org.uk> Cc: Tony Luck <tony.luck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: linux-ia64@vger.kernel.org Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: David S. Miller <davem@davemloft.net> Cc: sparclinux@vger.kernel.org Cc: Chris Metcalf <cmetcalf@tilera.com> Signed-off-by: Jason Cooper <jason@lakedaemon.net>
-
- 08 Aug, 2013 1 commit
-
-
Stephen Warren authored
Move all common select clauses from ARCH_TEGRA_*_SOC to ARCH_TEGRA to eliminate duplication. The USB-related selects all should have been common too, but were missing from Tegra114 previously. Move these to ARCH_TEGRA too. The latter fixes a build break when only Tegra114 support was enabled, but not Tegra20 or Tegra30 support. Signed-off-by: Stephen Warren <swarren@nvidia.com>
-
- 19 Jul, 2013 13 commits
-
-
Joseph Lo authored
The Tegra114 can support suspend function now, removing the limitation. Signed-off-by: Joseph Lo <josephl@nvidia.com> Signed-off-by: Stephen Warren <swarren@nvidia.com>
-
Joseph Lo authored
Adding suspend/resume function for tegra_cpu_car_ops. We only save and restore the setting of the clock of CoreSight. Other clocks still need to be taken care by clock driver. Cc: Mike Turquette <mturquette@linaro.org> Signed-off-by: Joseph Lo <josephl@nvidia.com> Signed-off-by: Stephen Warren <swarren@nvidia.com>
-
Joseph Lo authored
The flow controller can help CPU to go into suspend mode (powered-down state). When CPU goes into powered-down state, it needs some careful settings before getting into and after leaving. The enter and exit functions do that by configuring appropriate mode for flow controller. For Tegra114, the setting is compatible with Tegra30. Signed-off-by: Joseph Lo <josephl@nvidia.com> Signed-off-by: Stephen Warren <swarren@nvidia.com>
-
Joseph Lo authored
Hooking tegra_tear_down_cpu for Tegra114 for supporting cluster power down when CPU cluster suspneded in LP2. Signed-off-by: Joseph Lo <josephl@nvidia.com> Signed-off-by: Stephen Warren <swarren@nvidia.com>
-
Joseph Lo authored
When the last CPU core in suspend, the CPU power rail can be turned off by setting flags to flow controller. Then the flow controller will inform PMC to turn off the CPU rail when the last CPU goes into suspend. Signed-off-by: Joseph Lo <josephl@nvidia.com> Signed-off-by: Stephen Warren <swarren@nvidia.com>
-
Joseph Lo authored
When the CPU cluster power down, the vGIC is powered down too. The flow controller needs to monitor the legacy interrupt controller to wake up CPU. So setting up the appropriate wake up event in flow controller. Signed-off-by: Joseph Lo <josephl@nvidia.com> Signed-off-by: Stephen Warren <swarren@nvidia.com>
-
Joseph Lo authored
When there is a cluster power down cycle in suspend, we need to set up the correct L2 RAM data RAM latency to make L2 cache work correctly. This is only needed for cluster 0 and needs to be done in tegra_resume before the cache is enabled. Signed-off-by: Joseph Lo <josephl@nvidia.com> Signed-off-by: Stephen Warren <swarren@nvidia.com>
-
Joseph Lo authored
Adding a flag for tegra_disable_clean_inv_dcache to flush cache as LoUIS or ALL. After this patch, the v7_flush_dcache_louis is used for CPU hotplug and CPU suspend in CPU power down (e.g. CPU idle power-down mode) case. And the v7_flush_dcache_all is used for CPU cluster power down (e.g. suspend to LP2 mode). Signed-off-by: Joseph Lo <josephl@nvidia.com> Signed-off-by: Stephen Warren <swarren@nvidia.com>
-
Joseph Lo authored
The v7_invalidate_l1 was used for the L1 cache that come out from reset in a undefined state. This is no need for Cortex-A15. We do it for A9 only. Signed-off-by: Joseph Lo <josephl@nvidia.com> Signed-off-by: Stephen Warren <swarren@nvidia.com>
-
Joseph Lo authored
This supports CPU core power down on each CPU when CPU idle. When CPU go into this state, it saves it's context and needs a proper configuration in flow controller to power gate the CPU when CPU runs into WFI instruction. And the CPU also needs to set the IRQ as CPU power down idle wake up event in flow controller. Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Joseph Lo <josephl@nvidia.com> Signed-off-by: Stephen Warren <swarren@nvidia.com>
-
Joseph Lo authored
The flow controller would take care the power sequence when CPU idle in powered-down mode. It powered gate the CPU when CPU runs into WFI instruction. And wake up the CPU when event be triggered. The sequence is below. * setting wfi bitmap for the CPU as the halt event in the FLOW_CTRL_CPU_HALT_REG to monitor the CPU running into WFI,then power gate it * setting IRQ and FIQ as wake up event to wake up CPU when event triggered Signed-off-by: Joseph Lo <josephl@nvidia.com> Signed-off-by: Stephen Warren <swarren@nvidia.com>
-
Joseph Lo authored
There is a difference between GICv1 and v2 when CPU in power management mode (aka CPU power down on Tegra). For GICv1, IRQ/FIQ interrupt lines going to CPU are same lines which are also used for wake-interrupt. Therefore, we cannot disable the GIC CPU interface if we need to use same interrupts for CPU wake purpose. This creates a race condition for CPU power off entry. Also, in GICv1, disabling GICv1 CPU interface puts GICv1 into bypass mode such that incoming legacy IRQ/FIQ are sent to CPU, which means disabling GIC CPU interface doesn't really disable IRQ/FIQ to CPU. GICv2 provides a wake IRQ/FIQ (for wake-event purpose), which are not disabled by GIC CPU interface. This is done by adding a bypass override capability when the interrupts are disabled at the CPU interface. To support this, there are four bits about IRQ/FIQ BypassDisable in CPU interface Control Register. When the IRQ/FIQ not being driver by the CPU interface, each interrupt output signal can be deasserted rather than being driven by the legacy interrupt input. So the wake-event can be used as wakeup signals to SoC (system power controller). To prevent race conditions and ensure proper interrupt routing on Cortex-A15 CPUs when they are power-gated, add a CPU PM notifier call-back to reprogram the GIC CPU interface on PM entry. The GIC CPU interface will be reset back to its normal state by the common GIC CPU PM exit callback when the CPU wakes up. Based on the work by: Scott Williams <scwilliams@nvidia.com> Signed-off-by: Joseph Lo <josephl@nvidia.com> Signed-off-by: Stephen Warren <swarren@nvidia.com>
-
Joseph Lo authored
This reverts commit 510bb595 "ARM: tegra: add cpu_disable for hotplug". The Tegra114 support CPU0 hotplug function in HW physically, but it needs other software to make it work normally after we add CPU idle power down mode support. So remove them for now. Signed-off-by: Joseph Lo <josephl@nvidia.com> Signed-off-by: Stephen Warren <swarren@nvidia.com>
-
- 15 Jul, 2013 1 commit
-
-
Joseph Lo authored
The commit 93dc6887 (ARM: 7684/1: errata: Workaround for Cortex-A15 erratum 798181 (TLBI/DSB operations)) introduced a workaround for Cortex-A15 erratum 798181. Enable it for Tegra114. Signed-off-by: Joseph Lo <josephl@nvidia.com> Signed-off-by: Stephen Warren <swarren@nvidia.com>
-
- 14 Jul, 2013 6 commits
-
-
Linus Torvalds authored
-
git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linuxLinus Torvalds authored
Pull slab update from Pekka Enberg: "Highlights: - Fix for boot-time problems on some architectures due to init_lock_keys() not respecting kmalloc_caches boundaries (Christoph Lameter) - CONFIG_SLUB_CPU_PARTIAL requested by RT folks (Joonsoo Kim) - Fix for excessive slab freelist draining (Wanpeng Li) - SLUB and SLOB cleanups and fixes (various people)" I ended up editing the branch, and this avoids two commits at the end that were immediately reverted, and I instead just applied the oneliner fix in between myself. * 'slab/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux slub: Check for page NULL before doing the node_match check mm/slab: Give s_next and s_stop slab-specific names slob: Check for NULL pointer before calling ctor() slub: Make cpu partial slab support configurable slab: add kmalloc() to kernel API documentation slab: fix init_lock_keys slob: use DIV_ROUND_UP where possible slub: do not put a slab to cpu partial list when cpu_partial is 0 mm/slub: Use node_nr_slabs and node_nr_objs in get_slabinfo mm/slub: Drop unnecessary nr_partials mm/slab: Fix /proc/slabinfo unwriteable for slab mm/slab: Sharing s_next and s_stop between slab and slub mm/slab: Fix drain freelist excessively slob: Rework #ifdeffery in slab.h mm, slab: moved kmem_cache_alloc_node comment to correct place
-
Steven Rostedt authored
In the -rt kernel (mrg), we hit the following dump: BUG: unable to handle kernel NULL pointer dereference at (null) IP: [<ffffffff811573f1>] kmem_cache_alloc_node+0x51/0x180 PGD a2d39067 PUD b1641067 PMD 0 Oops: 0000 [#1] PREEMPT SMP Modules linked in: sunrpc cpufreq_ondemand ipv6 tg3 joydev sg serio_raw pcspkr k8temp amd64_edac_mod edac_core i2c_piix4 e100 mii shpchp ext4 mbcache jbd2 sd_mod crc_t10dif sr_mod cdrom sata_svw ata_generic pata_acpi pata_serverworks radeon ttm drm_kms_helper drm hwmon i2c_algo_bit i2c_core dm_mirror dm_region_hash dm_log dm_mod CPU 3 Pid: 20878, comm: hackbench Not tainted 3.6.11-rt25.14.el6rt.x86_64 #1 empty empty/Tyan Transport GT24-B3992 RIP: 0010:[<ffffffff811573f1>] [<ffffffff811573f1>] kmem_cache_alloc_node+0x51/0x180 RSP: 0018:ffff8800a9b17d70 EFLAGS: 00010213 RAX: 0000000000000000 RBX: 0000000001200011 RCX: ffff8800a06d8000 RDX: 0000000004d92a03 RSI: 00000000000000d0 RDI: ffff88013b805500 RBP: ffff8800a9b17dc0 R08: ffff88023fd14d10 R09: ffffffff81041cbd R10: 00007f4e3f06e9d0 R11: 0000000000000246 R12: ffff88013b805500 R13: ffff8801ff46af40 R14: 0000000000000001 R15: 0000000000000000 FS: 00007f4e3f06e700(0000) GS:ffff88023fd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000000000000 CR3: 00000000a2d3a000 CR4: 00000000000007e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process hackbench (pid: 20878, threadinfo ffff8800a9b16000, task ffff8800a06d8000) Stack: ffff8800a9b17da0 ffffffff81202e08 ffff8800a9b17de0 000000d001200011 0000000001200011 0000000001200011 0000000000000000 0000000000000000 00007f4e3f06e9d0 0000000000000000 ffff8800a9b17e60 ffffffff81041cbd Call Trace: [<ffffffff81202e08>] ? current_has_perm+0x68/0x80 [<ffffffff81041cbd>] copy_process+0xdd/0x15b0 [<ffffffff810a2125>] ? rt_up_read+0x25/0x30 [<ffffffff8104369a>] do_fork+0x5a/0x360 [<ffffffff8107c66b>] ? migrate_enable+0xeb/0x220 [<ffffffff8100b068>] sys_clone+0x28/0x30 [<ffffffff81527423>] stub_clone+0x13/0x20 [<ffffffff81527152>] ? system_call_fastpath+0x16/0x1b Code: 89 fc 89 75 cc 41 89 d6 4d 8b 04 24 65 4c 03 04 25 48 ae 00 00 49 8b 50 08 4d 8b 28 49 8b 40 10 4d 85 ed 74 12 41 83 fe ff 74 27 <48> 8b 00 48 c1 e8 3a 41 39 c6 74 1b 8b 75 cc 4c 89 c9 44 89 f2 RIP [<ffffffff811573f1>] kmem_cache_alloc_node+0x51/0x180 RSP <ffff8800a9b17d70> CR2: 0000000000000000 ---[ end trace 0000000000000002 ]--- Now, this uses SLUB pretty much unmodified, but as it is the -rt kernel with CONFIG_PREEMPT_RT set, spinlocks are mutexes, although they do disable migration. But the SLUB code is relatively lockless, and the spin_locks there are raw_spin_locks (not converted to mutexes), thus I believe this bug can happen in mainline without -rt features. The -rt patch is just good at triggering mainline bugs ;-) Anyway, looking at where this crashed, it seems that the page variable can be NULL when passed to the node_match() function (which does not check if it is NULL). When this happens we get the above panic. As page is only used in slab_alloc() to check if the node matches, if it's NULL I'm assuming that we can say it doesn't and call the __slab_alloc() code. Is this a correct assumption? Acked-by: Christoph Lameter <cl@linux.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Pekka Enberg <penberg@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds authored
Pull more vfs stuff from Al Viro: "O_TMPFILE ABI changes, Oleg's fput() series, misc cleanups, including making simple_lookup() usable for filesystems with non-NULL s_d_op, which allows us to get rid of quite a bit of ugliness" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: sunrpc: now we can just set ->s_d_op cgroup: we can use simple_lookup() now efivarfs: we can use simple_lookup() now make simple_lookup() usable for filesystems that set ->s_d_op configfs: don't open-code d_alloc_name() __rpc_lookup_create_exclusive: pass string instead of qstr rpc_create_*_dir: don't bother with qstr llist: llist_add() can use llist_add_batch() llist: fix/simplify llist_add() and llist_add_batch() fput: turn "list_head delayed_fput_list" into llist_head fs/file_table.c:fput(): add comment Safer ABI for O_TMPFILE
-
Al Viro authored
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Al Viro authored
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-