Commit f212ec4b authored by Bernhard Kaindl's avatar Bernhard Kaindl Committed by Ingo Molnar

x86: early boot debugging via FireWire (ohci1394_dma=early)

This patch adds a new configuration option, which adds support for a new
early_param which gets checked in arch/x86/kernel/setup_{32,64}.c:setup_arch()
to decide wether OHCI-1394 FireWire controllers should be initialized and
enabled for physical DMA access to allow remote debugging of early problems
like issues ACPI or other subsystems which are executed very early.

If the config option is not enabled, no code is changed, and if the boot
paramenter is not given, no new code is executed, and independent of that,
all new code is freed after boot, so the config option can be even enabled
in standard, non-debug kernels.

With specialized tools, it is then possible to get debugging information
from machines which have no serial ports (notebooks) such as the printk
buffer contents, or any data which can be referenced from global pointers,
if it is stored below the 4GB limit and even memory dumps of of the physical
RAM region below the 4GB limit can be taken without any cooperation from the
CPU of the host, so the machine can be crashed early, it does not matter.

In the extreme, even kernel debuggers can be accessed in this way. I wrote
a small kgdb module and an accompanying gdb stub for FireWire which allows
to gdb to talk to kgdb using remote remory reads and writes over FireWire.

An version of the gdb stub fore FireWire is able to read all global data
from a system which is running a a normal kernel without any kernel debugger,
without any interruption or support of the system's CPU. That way, e.g. the
task struct and so on can be read and even manipulated when the physical DMA
access is granted.

A HOWTO is included in this patch, in Documentation/debugging-via-ohci1394.txt
and I've put a copy online at
ftp://ftp.suse.de/private/bk/firewire/docs/debugging-via-ohci1394.txt

It also has links to all the tools which are available to make use of it
another copy of it is online at:
ftp://ftp.suse.de/private/bk/firewire/kernel/ohci1394_dma_early-v2.diffSigned-Off-By: default avatarBernhard Kaindl <bk@suse.de>
Tested-By: default avatarThomas Renninger <trenn@suse.de>
Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
parent 6194ba6f
Using physical DMA provided by OHCI-1394 FireWire controllers for debugging
---------------------------------------------------------------------------
Introduction
------------
Basically all FireWire controllers which are in use today are compliant
to the OHCI-1394 specification which defines the controller to be a PCI
bus master which uses DMA to offload data transfers from the CPU and has
a "Physical Response Unit" which executes specific requests by employing
PCI-Bus master DMA after applying filters defined by the OHCI-1394 driver.
Once properly configured, remote machines can send these requests to
ask the OHCI-1394 controller to perform read and write requests on
physical system memory and, for read requests, send the result of
the physical memory read back to the requester.
With that, it is possible to debug issues by reading interesting memory
locations such as buffers like the printk buffer or the process table.
Retrieving a full system memory dump is also possible over the FireWire,
using data transfer rates in the order of 10MB/s or more.
Memory access is currently limited to the low 4G of physical address
space which can be a problem on IA64 machines where memory is located
mostly above that limit, but it is rarely a problem on more common
hardware such as hardware based on x86, x86-64 and PowerPC.
Together with a early initialization of the OHCI-1394 controller for debugging,
this facility proved most useful for examining long debugs logs in the printk
buffer on to debug early boot problems in areas like ACPI where the system
fails to boot and other means for debugging (serial port) are either not
available (notebooks) or too slow for extensive debug information (like ACPI).
Drivers
-------
The OHCI-1394 drivers in drivers/firewire and drivers/ieee1394 initialize
the OHCI-1394 controllers to a working state and can be used to enable
physical DMA. By default you only have to load the driver, and physical
DMA access will be granted to all remote nodes, but it can be turned off
when using the ohci1394 driver.
Because these drivers depend on the PCI enumeration to be completed, an
initialization routine which can runs pretty early (long before console_init(),
which makes the printk buffer appear on the console can be called) was written.
To activate it, enable CONFIG_PROVIDE_OHCI1394_DMA_INIT (Kernel hacking menu:
Provide code for enabling DMA over FireWire early on boot) and pass the
parameter "ohci1394_dma=early" to the recompiled kernel on boot.
Tools
-----
firescope - Originally developed by Benjamin Herrenschmidt, Andi Kleen ported
it from PowerPC to x86 and x86_64 and added functionality, firescope can now
be used to view the printk buffer of a remote machine, even with live update.
Bernhard Kaindl enhanced firescope to support accessing 64-bit machines
from 32-bit firescope and vice versa:
- ftp://ftp.suse.de/private/bk/firewire/tools/firescope-0.2.2.tar.bz2
and he implemented fast system dump (alpha version - read README.txt):
- ftp://ftp.suse.de/private/bk/firewire/tools/firedump-0.1.tar.bz2
There is also a gdb proxy for firewire which allows to use gdb to access
data which can be referenced from symbols found by gdb in vmlinux:
- ftp://ftp.suse.de/private/bk/firewire/tools/fireproxy-0.33.tar.bz2
The latest version of this gdb proxy (fireproxy-0.34) can communicate (not
yet stable) with kgdb over an memory-based communication module (kgdbom).
Getting Started
---------------
The OHCI-1394 specification regulates that the OHCI-1394 controller must
disable all physical DMA on each bus reset.
This means that if you want to debug an issue in a system state where
interrupts are disabled and where no polling of the OHCI-1394 controller
for bus resets takes place, you have to establish any FireWire cable
connections and fully initialize all FireWire hardware __before__ the
system enters such state.
Step-by-step instructions for using firescope with early OHCI initialization:
1) Verify that your hardware is supported:
Load the ohci1394 or the fw-ohci module and check your kernel logs.
You should see a line similar to
ohci1394: fw-host0: OHCI-1394 1.1 (PCI): IRQ=[18] MMIO=[fe9ff800-fe9fffff]
... Max Packet=[2048] IR/IT contexts=[4/8]
when loading the driver. If you have no supported controller, many PCI,
CardBus and even some Express cards which are fully compliant to OHCI-1394
specification are available. If it requires no driver for Windows operating
systems, it most likely is. Only specialized shops have cards which are not
compliant, they are based on TI PCILynx chips and require drivers for Win-
dows operating systems.
2) Establish a working FireWire cable connection:
Any FireWire cable, as long at it provides electrically and mechanically
stable connection and has matching connectors (there are small 4-pin and
large 6-pin FireWire ports) will do.
If an driver is running on both machines you should see a line like
ieee1394: Node added: ID:BUS[0-01:1023] GUID[0090270001b84bba]
on both machines in the kernel log when the cable is plugged in
and connects the two machines.
3) Test physical DMA using firescope:
On the debug host,
- load the raw1394 module,
- make sure that /dev/raw1394 is accessible,
then start firescope:
$ firescope
Port 0 (ohci1394) opened, 2 nodes detected
FireScope
---------
Target : <unspecified>
Gen : 1
[Ctrl-T] choose target
[Ctrl-H] this menu
[Ctrl-Q] quit
------> Press Ctrl-T now, the output should be similar to:
2 nodes available, local node is: 0
0: ffc0, uuid: 00000000 00000000 [LOCAL]
1: ffc1, uuid: 00279000 ba4bb801
Besides the [LOCAL] node, it must show another node without error message.
4) Prepare for debugging with early OHCI-1394 initialization:
4.1) Kernel compilation and installation on debug target
Compile the kernel to be debugged with CONFIG_PROVIDE_OHCI1394_DMA_INIT
(Kernel hacking: Provide code for enabling DMA over FireWire early on boot)
enabled and install it on the machine to be debugged (debug target).
4.2) Transfer the System.map of the debugged kernel to the debug host
Copy the System.map of the kernel be debugged to the debug host (the host
which is connected to the debugged machine over the FireWire cable).
5) Retrieving the printk buffer contents:
With the FireWire cable connected, the OHCI-1394 driver on the debugging
host loaded, reboot the debugged machine, booting the kernel which has
CONFIG_PROVIDE_OHCI1394_DMA_INIT enabled, with the option ohci1394_dma=early.
Then, on the debugging host, run firescope, for example by using -A:
firescope -A System.map-of-debug-target-kernel
Note: -A automatically attaches to the first non-local node. It only works
reliably if only connected two machines are connected using FireWire.
After having attached to the debug target, press Ctrl-D to view the
complete printk buffer or Ctrl-U to enter auto update mode and get an
updated live view of recent kernel messages logged on the debug target.
Call "firescope -h" to get more information on firescope's options.
Notes
-----
Documentation and specifications: ftp://ftp.suse.de/private/bk/firewire/docs
FireWire is a trademark of Apple Inc. - for more information please refer to:
http://en.wikipedia.org/wiki/FireWire
......@@ -45,6 +45,7 @@
#include <linux/dmi.h>
#include <linux/pfn.h>
#include <linux/pci.h>
#include <linux/init_ohci1394_dma.h>
#include <video/edid.h>
......@@ -787,6 +788,16 @@ void __init setup_arch(char **cmdline_p)
smp_alloc_memory(); /* AP processor realmode stacks in low memory*/
#endif
paging_init();
/*
* NOTE: On x86-32, only from this point on, fixmaps are ready for use.
*/
#ifdef CONFIG_PROVIDE_OHCI1394_DMA_INIT
if (init_ohci1394_dma_early)
init_ohci1394_dma_on_all_controllers();
#endif
remapped_pgdat_init();
sparse_init();
zone_sizes_init();
......
......@@ -41,6 +41,7 @@
#include <linux/dma-mapping.h>
#include <linux/ctype.h>
#include <linux/uaccess.h>
#include <linux/init_ohci1394_dma.h>
#include <asm/mtrr.h>
#include <asm/uaccess.h>
......@@ -253,6 +254,11 @@ void __attribute__((weak)) __init memory_setup(void)
machine_specific_memory_setup();
}
/*
* setup_arch - architecture-specific boot-time initializations
*
* Note: On x86_64, fixmaps are ready for use even before this is called.
*/
void __init setup_arch(char **cmdline_p)
{
unsigned i;
......@@ -302,6 +308,11 @@ void __init setup_arch(char **cmdline_p)
parse_early_param();
#ifdef CONFIG_PROVIDE_OHCI1394_DMA_INIT
if (init_ohci1394_dma_early)
init_ohci1394_dma_on_all_controllers();
#endif
finish_e820_parsing();
early_gart_iommu_check();
......
......@@ -38,7 +38,7 @@ obj-$(CONFIG_SCSI) += scsi/
obj-$(CONFIG_ATA) += ata/
obj-$(CONFIG_FUSION) += message/
obj-$(CONFIG_FIREWIRE) += firewire/
obj-$(CONFIG_IEEE1394) += ieee1394/
obj-y += ieee1394/
obj-$(CONFIG_UIO) += uio/
obj-y += cdrom/
obj-y += auxdisplay/
......
......@@ -15,3 +15,4 @@ obj-$(CONFIG_IEEE1394_SBP2) += sbp2.o
obj-$(CONFIG_IEEE1394_DV1394) += dv1394.o
obj-$(CONFIG_IEEE1394_ETH1394) += eth1394.o
obj-$(CONFIG_PROVIDE_OHCI1394_DMA_INIT) += init_ohci1394_dma.o
/*
* init_ohci1394_dma.c - Initializes physical DMA on all OHCI 1394 controllers
*
* Copyright (C) 2006-2007 Bernhard Kaindl <bk@suse.de>
*
* Derived from drivers/ieee1394/ohci1394.c and arch/x86/kernel/early-quirks.c
* this file has functions to:
* - scan the PCI very early on boot for all OHCI 1394-compliant controllers
* - reset and initialize them and make them join the IEEE1394 bus and
* - enable physical DMA on them to allow remote debugging
*
* All code and data is marked as __init and __initdata, respective as
* during boot, all OHCI1394 controllers may be claimed by the firewire
* stack and at this point, this code should not touch them anymore.
*
* To use physical DMA after the initialization of the firewire stack,
* be sure that the stack enables it and (re-)attach after the bus reset
* which may be caused by the firewire stack initialization.
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software Foundation,
* Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
*/
#include <linux/interrupt.h> /* for ohci1394.h */
#include <linux/delay.h>
#include <linux/pci.h> /* for PCI defines */
#include <linux/init_ohci1394_dma.h>
#include <asm/pci-direct.h> /* for direct PCI config space access */
#include <asm/fixmap.h>
#include "ieee1394_types.h"
#include "ohci1394.h"
int __initdata init_ohci1394_dma_early;
/* Reads a PHY register of an OHCI-1394 controller */
static inline u8 __init get_phy_reg(struct ti_ohci *ohci, u8 addr)
{
int i;
quadlet_t r;
reg_write(ohci, OHCI1394_PhyControl, (addr << 8) | 0x00008000);
for (i = 0; i < OHCI_LOOP_COUNT; i++) {
if (reg_read(ohci, OHCI1394_PhyControl) & 0x80000000)
break;
mdelay(1);
}
r = reg_read(ohci, OHCI1394_PhyControl);
return (r & 0x00ff0000) >> 16;
}
/* Writes to a PHY register of an OHCI-1394 controller */
static inline void __init set_phy_reg(struct ti_ohci *ohci, u8 addr, u8 data)
{
int i;
reg_write(ohci, OHCI1394_PhyControl, (addr << 8) | data | 0x00004000);
for (i = 0; i < OHCI_LOOP_COUNT; i++) {
u32 r = reg_read(ohci, OHCI1394_PhyControl);
if (!(r & 0x00004000))
break;
mdelay(1);
}
}
/* Resets an OHCI-1394 controller (for sane state before initialization) */
static inline void __init init_ohci1394_soft_reset(struct ti_ohci *ohci) {
int i;
reg_write(ohci, OHCI1394_HCControlSet, OHCI1394_HCControl_softReset);
for (i = 0; i < OHCI_LOOP_COUNT; i++) {
if (!(reg_read(ohci, OHCI1394_HCControlSet)
& OHCI1394_HCControl_softReset))
break;
mdelay(1);
}
}
/* Basic OHCI-1394 register and port inititalization */
static inline void __init init_ohci1394_initialize(struct ti_ohci *ohci)
{
quadlet_t bus_options;
int num_ports, i;
/* Put some defaults to these undefined bus options */
bus_options = reg_read(ohci, OHCI1394_BusOptions);
bus_options |= 0x60000000; /* Enable CMC and ISC */
bus_options &= ~0x00ff0000; /* XXX: Set cyc_clk_acc to zero for now */
bus_options &= ~0x18000000; /* Disable PMC and BMC */
reg_write(ohci, OHCI1394_BusOptions, bus_options);
/* Set the bus number */
reg_write(ohci, OHCI1394_NodeID, 0x0000ffc0);
/* Enable posted writes */
reg_write(ohci, OHCI1394_HCControlSet,
OHCI1394_HCControl_postedWriteEnable);
/* Clear link control register */
reg_write(ohci, OHCI1394_LinkControlClear, 0xffffffff);
/* enable phys */
reg_write(ohci, OHCI1394_LinkControlSet,
OHCI1394_LinkControl_RcvPhyPkt);
/* Don't accept phy packets into AR request context */
reg_write(ohci, OHCI1394_LinkControlClear, 0x00000400);
/* Clear the Isochonouys interrupt masks */
reg_write(ohci, OHCI1394_IsoRecvIntMaskClear, 0xffffffff);
reg_write(ohci, OHCI1394_IsoRecvIntEventClear, 0xffffffff);
reg_write(ohci, OHCI1394_IsoXmitIntMaskClear, 0xffffffff);
reg_write(ohci, OHCI1394_IsoXmitIntEventClear, 0xffffffff);
/* Accept asyncronous transfer requests from all nodes for now */
reg_write(ohci,OHCI1394_AsReqFilterHiSet, 0x80000000);
/* Specify asyncronous transfer retries */
reg_write(ohci, OHCI1394_ATRetries,
OHCI1394_MAX_AT_REQ_RETRIES |
(OHCI1394_MAX_AT_RESP_RETRIES<<4) |
(OHCI1394_MAX_PHYS_RESP_RETRIES<<8));
/* We don't want hardware swapping */
reg_write(ohci, OHCI1394_HCControlClear, OHCI1394_HCControl_noByteSwap);
/* Enable link */
reg_write(ohci, OHCI1394_HCControlSet, OHCI1394_HCControl_linkEnable);
/* If anything is connected to a port, make sure it is enabled */
num_ports = get_phy_reg(ohci, 2) & 0xf;
for (i = 0; i < num_ports; i++) {
unsigned int status;
set_phy_reg(ohci, 7, i);
status = get_phy_reg(ohci, 8);
if (status & 0x20)
set_phy_reg(ohci, 8, status & ~1);
}
}
/**
* init_ohci1394_wait_for_busresets - wait until bus resets are completed
*
* OHCI1394 initialization itself and any device going on- or offline
* and any cable issue cause a IEEE1394 bus reset. The OHCI1394 spec
* specifies that physical DMA is disabled on each bus reset and it
* has to be enabled after each bus reset when needed. We resort
* to polling here because on early boot, we have no interrupts.
*/
static inline void __init init_ohci1394_wait_for_busresets(struct ti_ohci *ohci)
{
int i, events;
for (i=0; i < 9; i++) {
mdelay(200);
events = reg_read(ohci, OHCI1394_IntEventSet);
if (events & OHCI1394_busReset)
reg_write(ohci, OHCI1394_IntEventClear,
OHCI1394_busReset);
}
}
/**
* init_ohci1394_enable_physical_dma - Enable physical DMA for remote debugging
* This enables remote DMA access over IEEE1394 from every host for the low
* 4GB of address space. DMA accesses above 4GB are not available currently.
*/
static inline void __init init_ohci1394_enable_physical_dma(struct ti_ohci *hci)
{
reg_write(hci, OHCI1394_PhyReqFilterHiSet, 0xffffffff);
reg_write(hci, OHCI1394_PhyReqFilterLoSet, 0xffffffff);
reg_write(hci, OHCI1394_PhyUpperBound, 0xffff0000);
}
/**
* init_ohci1394_reset_and_init_dma - init controller and enable DMA
* This initializes the given controller and enables physical DMA engine in it.
*/
static inline void __init init_ohci1394_reset_and_init_dma(struct ti_ohci *ohci)
{
/* Start off with a soft reset, clears everything to a sane state. */
init_ohci1394_soft_reset(ohci);
/* Accessing some registers without LPS enabled may cause lock up */
reg_write(ohci, OHCI1394_HCControlSet, OHCI1394_HCControl_LPS);
/* Disable and clear interrupts */
reg_write(ohci, OHCI1394_IntEventClear, 0xffffffff);
reg_write(ohci, OHCI1394_IntMaskClear, 0xffffffff);
mdelay(50); /* Wait 50msec to make sure we have full link enabled */
init_ohci1394_initialize(ohci);
/*
* The initialization causes at least one IEEE1394 bus reset. Enabling
* physical DMA only works *after* *all* bus resets have calmed down:
*/
init_ohci1394_wait_for_busresets(ohci);
/* We had to wait and do this now if we want to debug early problems */
init_ohci1394_enable_physical_dma(ohci);
}
/**
* init_ohci1394_controller - Map the registers of the controller and init DMA
* This maps the registers of the specified controller and initializes it
*/
static inline void __init init_ohci1394_controller(int num, int slot, int func)
{
unsigned long ohci_base;
struct ti_ohci ohci;
printk(KERN_INFO "init_ohci1394_dma: initializing OHCI-1394"
" at %02x:%02x.%x\n", num, slot, func);
ohci_base = read_pci_config(num, slot, func, PCI_BASE_ADDRESS_0+(0<<2))
& PCI_BASE_ADDRESS_MEM_MASK;
set_fixmap_nocache(FIX_OHCI1394_BASE, ohci_base);
ohci.registers = (void *)fix_to_virt(FIX_OHCI1394_BASE);
init_ohci1394_reset_and_init_dma(&ohci);
}
/**
* debug_init_ohci1394_dma - scan for OHCI1394 controllers and init DMA on them
* Scans the whole PCI space for OHCI1394 controllers and inits DMA on them
*/
void __init init_ohci1394_dma_on_all_controllers(void)
{
int num, slot, func;
if (!early_pci_allowed())
return;
/* Poor man's PCI discovery, the only thing we can do at early boot */
for (num = 0; num < 32; num++) {
for (slot = 0; slot < 32; slot++) {
for (func = 0; func < 8; func++) {
u32 class = read_pci_config(num,slot,func,
PCI_CLASS_REVISION);
if ((class == 0xffffffff))
continue; /* No device at this func */
if (class>>8 != PCI_CLASS_SERIAL_FIREWIRE_OHCI)
continue; /* Not an OHCI-1394 device */
init_ohci1394_controller(num, slot, func);
break; /* Assume one controller per device */
}
}
}
printk(KERN_INFO "init_ohci1394_dma: finished initializing OHCI DMA\n");
}
/**
* setup_init_ohci1394_early - enables early OHCI1394 DMA initialization
*/
static int __init setup_ohci1394_dma(char *opt)
{
if (!strcmp(opt, "early"))
init_ohci1394_dma_early = 1;
return 0;
}
/* passing ohci1394_dma=early on boot causes early OHCI1394 DMA initialization */
early_param("ohci1394_dma", setup_ohci1394_dma);
......@@ -104,6 +104,9 @@ enum fixed_addresses {
(__end_of_permanent_fixed_addresses & 511),
FIX_BTMAP_BEGIN = FIX_BTMAP_END + NR_FIX_BTMAPS*FIX_BTMAPS_NESTING - 1,
FIX_WP_TEST,
#ifdef CONFIG_PROVIDE_OHCI1394_DMA_INIT
FIX_OHCI1394_BASE,
#endif
__end_of_fixed_addresses
};
......
......@@ -44,6 +44,9 @@ enum fixed_addresses {
FIX_IO_APIC_BASE_END = FIX_IO_APIC_BASE_0 + MAX_IO_APICS-1,
FIX_EFI_IO_MAP_LAST_PAGE,
FIX_EFI_IO_MAP_FIRST_PAGE = FIX_EFI_IO_MAP_LAST_PAGE+MAX_EFI_IO_PAGES-1,
#ifdef CONFIG_PROVIDE_OHCI1394_DMA_INIT
FIX_OHCI1394_BASE,
#endif
__end_of_fixed_addresses
};
......
#ifdef CONFIG_PROVIDE_OHCI1394_DMA_INIT
extern int __initdata init_ohci1394_dma_early;
extern void __init init_ohci1394_dma_on_all_controllers(void);
#endif
......@@ -586,5 +586,33 @@ config LATENCYTOP
Enable this option if you want to use the LatencyTOP tool
to find out which userspace is blocking on what kernel operations.
config PROVIDE_OHCI1394_DMA_INIT
bool "Provide code for enabling DMA over FireWire early on boot"
depends on PCI && X86
help
If you want to debug problems which hang or crash the kernel early
on boot and the crashing machine has a FireWire port, you can use
this feature to remotely access the memory of the crashed machine
over FireWire. This employs remote DMA as part of the OHCI1394
specification which is now the standard for FireWire controllers.
With remote DMA, you can monitor the printk buffer remotely using
firescope and access all memory below 4GB using fireproxy from gdb.
Even controlling a kernel debugger is possible using remote DMA.
Usage:
If ohci1394_dma=early is used as boot parameter, it will initialize
all OHCI1394 controllers which are found in the PCI config space.
As all changes to the FireWire bus such as enabling and disabling
devices cause a bus reset and thereby disable remote DMA for all
devices, be sure to have the cable plugged and FireWire enabled on
the debugging host before booting the debug target for debugging.
This code (~1k) is freed after boot. By then, the firewire stack
in charge of the OHCI-1394 controllers should be used instead.
See Documentation/debugging-via-ohci1394.txt for more information.
source "samples/Kconfig"
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment