kgdbts: (2 of 2) fix single step awareness to work correctly with SMP

The do_fork and sys_open tests have never worked properly on anything other than a UP configuration with the kgdb test suite. This is because the test suite did not fully implement the behavior of a real debugger. A real debugger tracks the state of what thread it asked to single step and can correctly continue other threads of execution or conditionally stop while waiting for the original thread single step request to return. Below is a simple method to cause a fatal kernel oops with the kgdb test suite on a 2 processor ARM system: while [ 1 ] ; do ls > /dev/null 2> /dev/null; done& while [ 1 ] ; do ls > /dev/null 2> /dev/null; done& echo V1I1F100 > /sys/module/kgdbts/parameters/kgdbts Very soon after starting the test the kernel will start warning with messages like: kgdbts: BP mismatch c002487c expected c0024878 ------------[ cut here ]------------ WARNING: at drivers/misc/kgdbts.c:317 check_and_rewind_pc+0x9c/0xc4() [<c01f6520>] (check_and_rewind_pc+0x9c/0xc4) [<c01f595c>] (validate_simple_test+0x3c/0xc4) [<c01f60d4>] (run_simple_test+0x1e8/0x274) The kernel will eventually recovers, but the test suite has completely failed to test anything useful. This patch implements behavior similar to a real debugger that does not rely on hardware single stepping by using only software planted breakpoints. In order to mimic a real debugger, the kgdb test suite now tracks the most recent thread that was continued (cont_thread_id), with the intent to single step just this thread. When the response to the single step request stops in a different thread that hit the original break point that thread will now get continued, while the debugger waits for the thread with the single step pending. Here is a high level description of the sequence of events. cont_instead_of_sstep = 0; 1) set breakpoint at do_fork 2) continue 3) Save the thread id where we stop to cont_thread_id 4) Remove breakpoint at do_fork 5) Reset the PC if needed depending on kernel exception type 6) soft single step 7) Check where we stopped if current thread != cont_thread_id { if (here for more than 2 times for the same thead) { ### must be a really busy system, start test again ### goto step 1 } goto step 5 } else { cont_instead_of_sstep = 0; } 8) clean up and run test again if needed 9) Clear out any threads that were waiting on a break point at the point in time the test is ended with get_cont_catch(). This happens sometimes because breakpoints are used in place of single stepping and some threads could have been in the debugger exception handling queue because breakpoints were hit concurrently on different CPUs. This also means we wait at least one second before unplumbing the debugger connection at the very end, so as respond to any debug threads waiting to be serviced. Cc: stable@vger.kernel.org # >= 3.0 Signed-off-by: Jason Wessel <jason.wessel@windriver.com>

kgdbts: (2 of 2) fix single step awareness to work correctly with SMP
The do_fork and sys_open tests have never worked properly on anything other than a UP configuration with the kgdb test suite. This is because the test suite did not fully implement the behavior of a real debugger. A real debugger tracks the state of what thread it asked to single step and can correctly continue other threads of execution or conditionally stop while waiting for the original thread single step request to return. Below is a simple method to cause a fatal kernel oops with the kgdb test suite on a 2 processor ARM system: while [ 1 ] ; do ls > /dev/null 2> /dev/null; done& while [ 1 ] ; do ls > /dev/null 2> /dev/null; done& echo V1I1F100 > /sys/module/kgdbts/parameters/kgdbts Very soon after starting the test the kernel will start warning with messages like: kgdbts: BP mismatch c002487c expected c0024878 ------------[ cut here ]------------ WARNING: at drivers/misc/kgdbts.c:317 check_and_rewind_pc+0x9c/0xc4() [<c01f6520>] (check_and_rewind_pc+0x9c/0xc4) [<c01f595c>] (validate_simple_test+0x3c/0xc4) [<c01f60d4>] (run_simple_test+0x1e8/0x274) The kernel will eventually recovers, but the test suite has completely failed to test anything useful. This patch implements behavior similar to a real debugger that does not rely on hardware single stepping by using only software planted breakpoints. In order to mimic a real debugger, the kgdb test suite now tracks the most recent thread that was continued (cont_thread_id), with the intent to single step just this thread. When the response to the single step request stops in a different thread that hit the original break point that thread will now get continued, while the debugger waits for the thread with the single step pending. Here is a high level description of the sequence of events. cont_instead_of_sstep = 0; 1) set breakpoint at do_fork 2) continue 3) Save the thread id where we stop to cont_thread_id 4) Remove breakpoint at do_fork 5) Reset the PC if needed depending on kernel exception type 6) soft single step 7) Check where we stopped if current thread != cont_thread_id { if (here for more than 2 times for the same thead) { ### must be a really busy system, start test again ### goto step 1 } goto step 5 } else { cont_instead_of_sstep = 0; } 8) clean up and run test again if needed 9) Clear out any threads that were waiting on a break point at the point in time the test is ended with get_cont_catch(). This happens sometimes because breakpoints are used in place of single stepping and some threads could have been in the debugger exception handling queue because breakpoints were hit concurrently on different CPUs. This also means we wait at least one second before unplumbing the debugger connection at the very end, so as respond to any debug threads waiting to be serviced. Cc: stable@vger.kernel.org # >= 3.0 Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
23bbd8e3 · Jason Wessel · 486c5987 · 23bbd8e3
Commit 23bbd8e3 authored Mar 29, 2012 by Jason Wessel
Hide whitespace changes
Inline Side-by-side

Showing with 62 additions and 11 deletions

drivers/misc/kgdbts.c drivers/misc/kgdbts.c +62 -11

No files found.
--- a/drivers/misc/kgdbts.c
+++ b/drivers/misc/kgdbts.c
@@ -142,7 +142,9 @@ static int arch_needs_sstep_emulation = 1;
 #else
 static int arch_needs_sstep_emulation;
 #endif
+static unsigned long cont_addr;
 static unsigned long sstep_addr;
+static int restart_from_top_after_write;
 static int sstep_state;

 /* Storage for the registers, in GDB format. */
@@ -190,7 +192,8 @@ static int kgdbts_unreg_thread(void *ptr)
 	 */
 	while (!final_ack)
 		msleep_interruptible(1500);
-
+	/* Pause for any other threads to exit after final ack. */
+	msleep_interruptible(1000);
 	if (configured)
 		kgdb_unregister_io_module(&kgdbts_io_ops);
 	configured = 0;
@@ -312,13 +315,21 @@ static int check_and_rewind_pc(char *put_str, char *arg)
 	if (addr + BREAK_INSTR_SIZE == ip)
 		offset = -BREAK_INSTR_SIZE;
 #endif
-	if (strcmp(arg, "silent") && ip + offset != addr) {
+
+	if (arch_needs_sstep_emulation && sstep_addr &&
+	    ip + offset == sstep_addr &&
+	    ((!strcmp(arg, "sys_open") || !strcmp(arg, "do_fork")))) {
+		/* This is special case for emulated single step */
+		v2printk("Emul: rewind hit single step bp\n");
+		restart_from_top_after_write = 1;
+	} else if (strcmp(arg, "silent") && ip + offset != addr) {
 		eprintk("kgdbts: BP mismatch %lx expected %lx\n",
 			   ip + offset, addr);
 		return 1;
 	}
 	/* Readjust the instruction pointer if needed */
 	ip += offset;
+	cont_addr = ip;
 #ifdef GDB_ADJUSTS_BREAK_OFFSET
 	instruction_pointer_set(&kgdbts_regs, ip);
 #endif
@@ -328,6 +339,8 @@ static int check_and_rewind_pc(char *put_str, char *arg)
 static int check_single_step(char *put_str, char *arg)
 {
 	unsigned long addr = lookup_addr(arg);
+	static int matched_id;
+
 	/*
 	 * From an arch indepent point of view the instruction pointer
 	 * should be on a different instruction
@@ -338,17 +351,28 @@ static int check_single_step(char *put_str, char *arg)
 	v2printk("Singlestep stopped at IP: %lx\n",
 		   instruction_pointer(&kgdbts_regs));

-	if (sstep_thread_id != cont_thread_id && !arch_needs_sstep_emulation) {
+	if (sstep_thread_id != cont_thread_id) {
 		/*
 		 * Ensure we stopped in the same thread id as before, else the
 		 * debugger should continue until the original thread that was
 		 * single stepped is scheduled again, emulating gdb's behavior.
 		 */
 		v2printk("ThrID does not match: %lx\n", cont_thread_id);
+		if (arch_needs_sstep_emulation) {
+			if (matched_id &&
+			    instruction_pointer(&kgdbts_regs) != addr)
+				goto continue_test;
+			matched_id++;
+			ts.idx -= 2;
+			sstep_state = 0;
+			return 0;
+		}
 		cont_instead_of_sstep = 1;
 		ts.idx -= 4;
 		return 0;
 	}
+continue_test:
+	matched_id = 0;
 	if (instruction_pointer(&kgdbts_regs) == addr) {
 		eprintk("kgdbts: SingleStep failed at %lx\n",
 			   instruction_pointer(&kgdbts_regs));
@@ -390,6 +414,31 @@ static int got_break(char *put_str, char *arg)
 	return 1;
 }

+static void get_cont_catch(char *arg)
+{
+	/* Always send detach because the test is completed at this point */
+	fill_get_buf("D");
+}
+
+static int put_cont_catch(char *put_str, char *arg)
+{
+	/* This is at the end of the test and we catch any and all input */
+	v2printk("kgdbts: cleanup task: %lx\n", sstep_thread_id);
+	ts.idx--;
+	return 0;
+}
+
+static int emul_reset(char *put_str, char *arg)
+{
+	if (strncmp(put_str, "$OK", 3))
+		return 1;
+	if (restart_from_top_after_write) {
+		restart_from_top_after_write = 0;
+		ts.idx = -1;
+	}
+	return 0;
+}
+
 static void emul_sstep_get(char *arg)
 {
 	if (!arch_needs_sstep_emulation) {
@@ -443,8 +492,7 @@ static int emul_sstep_put(char *put_str, char *arg)
 		v2printk("Stopped at IP: %lx\n",
 			 instruction_pointer(&kgdbts_regs));
 		/* Want to stop at IP + break instruction size by default */
-		sstep_addr = instruction_pointer(&kgdbts_regs) +
-			BREAK_INSTR_SIZE;
+		sstep_addr = cont_addr + BREAK_INSTR_SIZE;
 		break;
 	case 2:
 		if (strncmp(put_str, "$OK", 3)) {
@@ -456,6 +504,9 @@ static int emul_sstep_put(char *put_str, char *arg)
 		if (strncmp(put_str, "$T0", 3)) {
 			eprintk("kgdbts: failed continue sstep\n");
 			return 1;
+		} else {
+			char *ptr = &put_str[11];
+			kgdb_hex2long(&ptr, &sstep_thread_id);
 		}
 		break;
 	case 4:
@@ -558,13 +609,13 @@ static struct test_struct do_fork_test[] = {
 	{ "c", "T0*", NULL, get_thread_id_continue }, /* Continue */
 	{ "do_fork", "OK", sw_rem_break }, /*remove breakpoint */
 	{ "g", "do_fork", NULL, check_and_rewind_pc }, /* check location */
-	{ "write", "OK", write_regs }, /* Write registers */
+	{ "write", "OK", write_regs, emul_reset }, /* Write registers */
 	{ "s", "T0*", emul_sstep_get, emul_sstep_put }, /* Single step */
 	{ "g", "do_fork", NULL, check_single_step },
 	{ "do_fork", "OK", sw_break, }, /* set sw breakpoint */
 	{ "7", "T0*", skip_back_repeat_test }, /* Loop based on repeat_test */
 	{ "D", "OK", NULL, final_ack_set }, /* detach and unregister I/O */
-	{ "", "" },
+	{ "", "", get_cont_catch, put_cont_catch },
 };

 /* Test for hitting a breakpoint at sys_open for what ever the number
@@ -576,13 +627,13 @@ static struct test_struct sys_open_test[] = {
 	{ "c", "T0*", NULL, get_thread_id_continue }, /* Continue */
 	{ "sys_open", "OK", sw_rem_break }, /*remove breakpoint */
 	{ "g", "sys_open", NULL, check_and_rewind_pc }, /* check location */
-	{ "write", "OK", write_regs }, /* Write registers */
+	{ "write", "OK", write_regs, emul_reset }, /* Write registers */
 	{ "s", "T0*", emul_sstep_get, emul_sstep_put }, /* Single step */
 	{ "g", "sys_open", NULL, check_single_step },
 	{ "sys_open", "OK", sw_break, }, /* set sw breakpoint */
 	{ "7", "T0*", skip_back_repeat_test }, /* Loop based on repeat_test */
 	{ "D", "OK", NULL, final_ack_set }, /* detach and unregister I/O */
-	{ "", "" },
+	{ "", "", get_cont_catch, put_cont_catch },
 };

 /*
@@ -725,8 +776,8 @@ static int run_simple_test(int is_get_char, int chr)
 	/* This callback is a put char which is when kgdb sends data to
 	 * this I/O module.
 	 */
-	if (ts.tst[ts.idx].get[0] == '\0' &&
-		ts.tst[ts.idx].put[0] == '\0') {
+	if (ts.tst[ts.idx].get[0] == '\0' && ts.tst[ts.idx].put[0] == '\0' &&
+	    !ts.tst[ts.idx].get_handler) {
 		eprintk("kgdbts: ERROR: beyond end of test on"
 			   " '%s' line %i\n", ts.name, ts.idx);
 		return 0;