Go to: When we have source code · We don’t have source code: ask the OS · We don’t have source code: look at the binary · When we debug the OS · Coredump
Fire up a debugger:
Don’t forget the nice hexdump macros. Write Python scripts for parsing buffer content (for example, when debugging a serial protocol).
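A minimal sketch of such a parsing script. The frame layout here is purely hypothetical (a start byte `0xAA`, a length byte, the payload, then an XOR checksum of the payload); adapt `START` and the field offsets to the protocol you are actually debugging.

```python
from dataclasses import dataclass

START = 0xAA  # hypothetical start-of-frame marker, not from any real protocol


@dataclass
class Frame:
    payload: bytes
    checksum_ok: bool


def parse_frame(hexdump: str) -> Frame:
    """Parse one frame: START, LEN, payload[LEN], XOR checksum of payload."""
    buf = bytes.fromhex(hexdump)  # whitespace between bytes is ignored
    if len(buf) < 3 or buf[0] != START:
        raise ValueError("no start-of-frame marker")
    length = buf[1]
    if len(buf) < 3 + length:
        raise ValueError("truncated frame")
    payload = buf[2:2 + length]
    checksum = 0
    for byte in payload:
        checksum ^= byte
    return Frame(payload, checksum == buf[2 + length])


if __name__ == "__main__":
    # Paste the bytes printed by your hexdump macro here.
    print(parse_frame("aa 03 01 02 03 00"))
```

Pasting the debugger's hexdump output straight into `parse_frame` beats decoding the bytes by eye every time.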
Or start tracing:
How to decide which one to use? A debugger stops the execution; printing only slows it down. If even the slowdown is a problem, see RTT.
Race condition:
Try compiling with -fsanitize=thread. If that is not an option, try to reproduce the bug:
while :; do ./testsuite "Testcase with race condition" || break; done
CPU hogs can speed things up:
stress -c 4
If you work in a stress-free environment (stress is not installed), yes does the job:
yes > /dev/null
Now in GDB:
gdb --args ./testsuite "Race condition"
Run forever:
set pagination off
break exit
commands
run
end
Recover lost PC:
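When the program has jumped to a bad address, the backtrace is useless. On architectures with a link register, such as ARM, set the program counter to $lr (the link register, which holds the return address of the last call) so that GDB can unwind again.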
Thread 1 "testsuite_pvxs" received signal SIGSEGV, Segmentation fault.
0x0000ffffec001600 in ?? ()
(gdb) bt
#0 0x0000ffffec001600 in ?? ()
#1 0x0000ffffffffedd0 in ?? ()
Backtrace stopped: not enough registers or memory available to unwind further
(gdb) set $pc = $lr
(gdb) up 2
#2 0x0000fffff7f554c8 in non-virtual thunk to pvxs::client::SubscriptionImpl::cancel() () from /usr/lib64/libpvxs.so.1.0
Make aliases for long arguments:
Some machines don’t have persistent shell history by default. Try not to lose long commands.
alias gdb-my-failing='gdb --args ./testsuite "MyFailingTestcase: Do this and that" '
The Unix kernel is a traffic cop that mediates different demands for time, memory, disks, and so on. - Unix Power Tools
Ask the cops.
-e
Latency, from a general point of view, is a time delay between the cause and the effect of some physical change in the system being observed. - wiki
LD_PRELOAD
Add some prints:
dev_info(&pdev->dev, "Why are we here?\n");
Then read the messages with dmesg.
Catching device events:
cat /proc/interrupts
udevadm info -a -n /dev/sda | less
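The counters in /proc/interrupts only tell you something when compared over time. Below is a small sketch that sums the per-CPU counts of two snapshots and reports which interrupt lines moved. The function names and the one-second interval are my own choices, not part of any tool.

```python
import time


def parse_interrupts(text: str) -> dict[str, int]:
    """Return {irq label: count summed over all CPUs}."""
    lines = text.splitlines()
    if not lines:
        return {}
    ncpu = len(lines[0].split())  # header row: CPU0 CPU1 ...
    totals = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep:
            continue
        counts = []
        for field in rest.split()[:ncpu]:
            if not field.isdigit():
                break  # reached the controller/description columns
            counts.append(int(field))
        totals[label.strip()] = sum(counts)
    return totals


def changed(before: dict[str, int], after: dict[str, int]) -> dict[str, int]:
    """Return {irq label: delta} for every counter that moved."""
    return {
        irq: count - before.get(irq, 0)
        for irq, count in after.items()
        if count != before.get(irq, 0)
    }


def watch_once(path: str = "/proc/interrupts", interval: float = 1.0) -> dict[str, int]:
    """Take two snapshots `interval` seconds apart and diff them (Linux only)."""
    with open(path) as f:
        before = parse_interrupts(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = parse_interrupts(f.read())
    return changed(before, after)
```

Call `watch_once()` while poking the device. If its interrupt line does not show up in the result, the event never reached the kernel.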
The /sys:
Check if systemd coredump collection is enabled.
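One way to check is to look at /proc/sys/kernel/core_pattern: when it starts with `|`, the kernel pipes core dumps to the named helper program. The sketch below assumes the helper binary is called `systemd-coredump`, which is what I have seen on systemd-based distributions; the function names are mine.

```python
def uses_systemd_coredump(core_pattern: str) -> bool:
    """True if the core_pattern pipes dumps to a systemd-coredump helper."""
    pattern = core_pattern.strip()
    if not pattern.startswith("|"):
        return False  # plain file name template, no helper involved
    fields = pattern[1:].split()
    # The first field is the helper path; the rest are %-specifiers.
    return bool(fields) and fields[0].endswith("systemd-coredump")


def read_core_pattern(path: str = "/proc/sys/kernel/core_pattern") -> str:
    """Read the current core pattern (Linux only)."""
    with open(path) as f:
        return f.read()
```

On the target machine, `uses_systemd_coredump(read_core_pattern())` answers the question. If it returns True, `coredumpctl list` should show the collected dumps.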
#sw #debug