Yesterday I released the first version of node-capstone, a node module that provides bindings for the Capstone disassembler library. It's the first piece of a debugger project that I recently restarted. It's a long story, but the project has been in development for about 5 years. These are its latest developments.
Hacking like a Samurai
First, I should probably establish my credentials in this area. I've written debuggers for five different architectures. Two of those were released publicly: FCEUd and GCNrd. The other three targeted Nintendo DS, Nintendo 64, and Nintendo Virtual Boy.
There are a few patterns to be spotted here! Obviously, they are all for Nintendo game systems. But more importantly, the debuggers interact with and analyze assembly code. Games for the most recent of these machines (NDS, GCN, and N64) were written in C and C++, but of course we don't have the luxury of inspecting any of this source code. The next best thing is analysis of the machine code itself: transform the machine code into human-readable assembly language, and begin studying!
In the past three years, I have written a lot of JavaScript and Python, which leaves little room for poking around in assembly languages. But there are good debug environments for each: for JavaScript you have Firebug, Firefox Browser Debugger, and the Webkit Inspector; for Python there's the excellent pdb module. These tools have a lot in common: they provide breakpoints for trapping code execution, let you inspect and modify variables at runtime, and let you inject code. Most importantly, they are all source-level debuggers.
Source Level vs Machine Level
So let's talk about source level debuggers and how they differ from machine level debuggers. A source level debugger expects the original source code and debug info from the compiler that maps machine code addresses to the source code. Some of these source level debuggers (gdb, et al.) will disassemble machine code if there is no source (or if you ask them to), but these tools are not designed for static analysis. They can give you a murky picture of the machine code at best.
A machine level debugger is the ultimate super king of static analysis. There is no "source of truth" to make sense of the blob of binary being analyzed. Some executable formats provide hints about code structure, like which segments contain code and which segments are only for data. Then there is the issue of determining which bytes in the code segments are instructions (this is complicated by architectures that use variable-sized instructions; x86 is one of them). Finally, the debugger must determine the overall structure of the code: which groups of instructions comprise a complete function, flow control, how functions refer to one another, how they refer to data, pattern matching and heuristics to identify common library functions, etc.
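To make the variable-sized instruction problem concrete, here is a minimal sketch of a linear sweep over a made-up instruction set. The opcode table, the mnemonics, and the `linearSweep` helper are all assumptions invented for illustration; they do not correspond to x86 or to any real architecture.

```javascript
// Toy, made-up instruction set (NOT a real architecture):
// the opcode byte alone determines the total instruction length.
const TOY_ISA = {
  0x01: { mnemonic: "nop", length: 1 },
  0x02: { mnemonic: "push", length: 2 }, // opcode + 1 operand byte
  0x03: { mnemonic: "jump", length: 3 }, // opcode + 2 operand bytes
};

// Linear sweep: decode one instruction after another, starting at `start`.
function linearSweep(bytes, start) {
  const out = [];
  let pc = start;
  while (pc < bytes.length) {
    const def = TOY_ISA[bytes[pc]];
    if (!def || pc + def.length > bytes.length) {
      // Unknown opcode or truncated instruction: treat the byte as data.
      out.push({ address: pc, mnemonic: "(data)", length: 1 });
      pc += 1;
      continue;
    }
    out.push({ address: pc, mnemonic: def.mnemonic, length: def.length });
    pc += def.length;
  }
  return out;
}

// The same bytes, decoded from two different starting offsets.
const blob = Buffer.from([0x02, 0x03, 0x01, 0x01, 0x02, 0x01]);
const fromZero = linearSweep(blob, 0).map((insn) => insn.mnemonic);
const fromOne = linearSweep(blob, 1).map((insn) => insn.mnemonic);
console.log(fromZero.join(" "));
console.log(fromOne.join(" "));
```

The two sweeps disagree about where instructions begin, even though both decode "successfully". Without a known entry point or hints from the executable format, a disassembler cannot tell which reading is the real one.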
With static analysis you can go even one step further, and determine how sets of instructions can be "decompiled" into a high-level language. A simple example is a load instruction followed by an arithmetic instruction that adds 5, and a store instruction back to the same address; this set could be equivalent to a variable assignment like the following in C:
x += 5;
This method of analysis is not necessary in a source level debugger. But it becomes important with fairly complex code in a machine level debugger. The above snippet of C is immediately recognizable, compared to the sample assembly below:
ldr r3, 8[r4]
addi r3, r3, #5
str r3, 8[r4]
It is this sophisticated analysis that I want my debugger to do. I have a long, long way to go before it gets to that point.
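As a rough illustration of that kind of analysis, the load/add/store pattern can be recognized with a tiny peephole matcher. This is only a sketch under stated assumptions: the instruction record format (`op`, `reg`, `base`, `offset`, and so on) and the `matchAddAssign` helper are hypothetical, and a real decompiler would also have to prove that the scratch register is not used afterwards.

```javascript
// Hypothetical instruction records; a real disassembler has its own format.
const insns = [
  { op: "ldr", reg: "r3", base: "r4", offset: 8 },
  { op: "addi", dst: "r3", src: "r3", imm: 5 },
  { op: "str", reg: "r3", base: "r4", offset: 8 },
];

// Try to match load / add-immediate / store-back starting at index i.
// Returns a C-like statement, or null when the pattern does not apply.
function matchAddAssign(list, i) {
  const [load, add, store] = list.slice(i, i + 3);
  if (!load || !add || !store) return null;
  if (load.op !== "ldr" || add.op !== "addi" || store.op !== "str") return null;

  // The store must write back to the address the load read from.
  const sameSlot = load.base === store.base && load.offset === store.offset;
  // One scratch register must carry the value through all three steps.
  const sameReg =
    load.reg === add.src && add.dst === add.src && add.dst === store.reg;
  // The load must not overwrite the base register it addresses through.
  const baseIntact = load.reg !== load.base;

  if (!sameSlot || !sameReg || !baseIntact) return null;
  return `*(${load.base} + ${load.offset}) += ${add.imm};`;
}

const statement = matchAddAssign(insns, 0);
console.log(statement);
```

The result is still expressed in terms of a register and an offset. Turning `*(r4 + 8)` into a named variable like `x` takes further analysis of what `r4` points to.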
The Project
I've explained enough to describe my intention: building a machine level debugger with awesome analysis tools. But I haven't really touched on internal design or even specific architecture targets or languages. And with good reason! This debugger is taking an architecture-independent approach, meaning it should be able to do a decent job debugging anything that it can disassemble. But it can do the best job with architectures that it can analyze automatically.
To achieve this abstraction, the debugger needs a solid core with all architectures supported by plugins or modules. Even the tracing of running processes will be written with an abstraction layer: a protocol I'm designing called Scalable Remote Debugger Protocol. SRDP provides a generic interface for transporting raw data and issuing primitive debugger commands like "add breakpoint". The protocol also supports a notification mechanism for event handling (like a breakpoint being triggered).
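The command-plus-notification idea can be sketched with Node's built-in EventEmitter. Nothing below comes from SRDP itself, which is still being designed: the `MockTarget` class, the `addBreakpoint` and `step` methods, and the `"breakpoint"` event name are all hypothetical stand-ins chosen for illustration.

```javascript
const { EventEmitter } = require("events");

// Hypothetical stand-in for a debug target; none of these names are SRDP.
class MockTarget extends EventEmitter {
  constructor() {
    super();
    this.breakpoints = new Set();
  }

  // Primitive command: remember an address to stop at.
  addBreakpoint(address) {
    this.breakpoints.add(address);
  }

  // Simulate execution reaching an address; notify listeners on a hit.
  step(pc) {
    if (this.breakpoints.has(pc)) {
      this.emit("breakpoint", { address: pc });
      return true;
    }
    return false;
  }
}

const target = new MockTarget();
const hits = [];

// The UI side subscribes to notifications instead of polling the target.
target.on("breakpoint", (event) => hits.push(event.address));

target.addBreakpoint(0x1001);
[0x1000, 0x1001, 0x1002].forEach((pc) => target.step(pc));
console.log(hits.map((address) => "0x" + address.toString(16)));
```

The point of the split is that the UI only needs to know the command and event vocabulary, not how a particular target implements breakpoints.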
SRDP
The protocol is still in the design phase, having undergone many revisions already. The latest developments in protocol land have been separating the data blocks from the data streams. CPU memory is an example of a block of data. Things like commands and event notifications are data streams. Conceptually all three can use the same transport, but the endpoints will be a bit different:
- Data block endpoints are encapsulated as a mountable block storage device, similar to a USB flash drive or SD card. This allows interacting with a machine using native OS utilities; want to read raw memory from a specific address? Open and read the file, just like you are already used to doing!
- Commands and events are encapsulated in a data stream interface using sockets, which itself uses the same open/read/write/close API as the file system.
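If target memory does show up as a file, reading from a specific address is just a positioned read. Here is a minimal sketch using Node's `fs` module. The `readMemory` helper is hypothetical, and a temporary file stands in for the mounted memory image, since the real mount layout is not defined here.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Read up to `length` bytes starting at `address` from a file that
// exposes target memory. The address is simply the file offset.
function readMemory(file, address, length) {
  const fd = fs.openSync(file, "r");
  try {
    const buf = Buffer.alloc(length);
    const bytesRead = fs.readSync(fd, buf, 0, length, address);
    return buf.subarray(0, bytesRead);
  } finally {
    fs.closeSync(fd);
  }
}

// Stand-in for a mounted memory image (an assumption for this sketch).
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "srdp-"));
const image = path.join(dir, "ram.bin");
fs.writeFileSync(
  image,
  Buffer.from([0x55, 0x48, 0x8b, 0x05, 0xb8, 0x13, 0x00, 0x00])
);

const bytes = readMemory(image, 4, 2);
console.log(bytes.toString("hex"));
```

Any tool that can seek and read a file (a hex editor, `dd`, a script like this one) gets memory access for free, without speaking a debugger-specific protocol.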
Putting the data blocks into a mountable storage interface was an idea from qeed, who is working on something similar for his GameBoy emulator.
SRDP gives us a way to connect debugger targets to debugger UIs. A UI could provide a text mode interface like gdb, or a graphical interface like IDA Pro.
The User Interface
This project has been in development for a long time, and it has gone through plenty of revisions itself. When I initiated the project over 5 years ago, I spent a lot of time researching available GUI toolkits. The only one I found promising was the web stack! HTML is already used to lay out every website in existence, CSS styles them, and JavaScript breathes life into them. This is the web stack; it is not tied to any particular operating system, and it works everywhere.
As early as '10, the only real web stack that could be used outside of a browser was Mozilla's framework, called XULRunner. I've written a few samples with it [1] [2], and was impressed immediately. But at the time, the technology was not mature enough to support development of a full-featured application by just one man.
XULRunner is still alive and kicking today, in the same capacity as always; it powers Firefox and similar applications. There are three reasons I did not stick with it as my platform of choice:
- XUL. This is the name of a structural language with XML syntax that replaces HTML as the GUI layout definition format in XULRunner. You don't so much write HTML+CSS+JavaScript as you write XUL+CSS+JavaScript. It's rather unfamiliar.
- XPCOM. This technology allows JavaScript to call functions written in another language (C, C++, Java, Python, etc.). XPCOM itself is written in C++. Even to interface with an existing dynamically linked library, one would have to write an XPCOM component in C++ to create the bindings. This situation has changed since '10, with the introduction of js-ctypes.
- Debugging. Native debugger UIs in XULRunner were non-existent at the time. It was possible to install a version of Firebug which worked with XULRunner, but it just wasn't the same. This situation may also have been alleviated, now that Firefox Browser Debugger is available.
Due to the rough edges with XULRunner, I briefly switched gears away from JavaScript for my debugger UI, and went full-force into Python+Tkinter! While Tk certainly has its place, I'm fully convinced that it is not for developing debugger user interfaces. It does not give you full control over element styling (only a little!) and there's a grievous bug in the OS X release that will probably go unfixed for the next decade. The only thing Tkinter has going for it is that it's bundled with Python.
With Python eliminated, I'm back to JavaScript as the host for my application. Being in game development, I was made aware of node-webkit, which is very similar to XULRunner, but uses HTML everywhere instead of XUL, integrates node.js instead of XPCOM, and has the Webkit Inspector built in for debugging. Seems perfect, right?
The problem with node.js is that native modules must be written in C++ (if you can't tell, I have a major aversion to this language). But then I came across the ffi module, which can be compared to js-ctypes. Now it's perfect!
Baby Steps
I have not yet decided on how I want to render the disassembler or hex editor (two widgets that are required in any machine level debugger), but I know exactly how I'm going to implement the disassembler!
In the Python experiment, I wrote a complete MIPS disassembler directly in Python. It wasn't going to be the fastest disassembler ever, but it sure was easy to write and extend! Not wanting to rewrite it in JavaScript, I decided to look for an open source library that would suit my needs. I found Capstone. It is awesome.
Capstone is written in C, based on LLVM, and supports five architectures: x86 (including x86_64), ARM, ARM64, MIPS, and PowerPC, with two more coming in the next version (SystemZ and Sparc). These are all interesting to me (especially x86_64, which is a disassembler I do NOT want to write myself), and the library should be straightforward to extend with other architectures that I want.
All I need now is JavaScript bindings for it!
var capstone = require("capstone");
var code = new Buffer([
    0x55, 0x48, 0x8b, 0x05, 0xb8, 0x13, 0x00, 0x00
]);
var cs = new capstone.Cs(capstone.ARCH.X86, capstone.MODE.X64);
cs.disasm(code, 0x1000).forEach(function (insn) {
    console.log(
        "0x%s:\t%s\t%s",
        insn.address.toString(16), insn.mnemonic, insn.op_str
    );
});
cs.close();
Welp! That was easy! :D
In all seriousness, I spent a weekend getting this working. Here's the actual output:
0x1000: push rbp
0x1001: mov rax, qword ptr [rip + 0x13b8]
Compare to the Python example: http://www.capstone-engine.org/lang_python.html (the close() method was unavoidable in JavaScript, because there are no destructors.)
It is written as a node module using ffi, and has been released on GitHub under the terms of the MIT license. This will provide the "solid core" that I mentioned earlier. Hook the disassembler up with some static analysis tools, an awesome renderer/editor, and SRDP, and you get a modular debugger foundation that runs everywhere and debugs everything!
It's still just a tool
While that sounds quite glorified and romantic, it must be said that even the one-debugger-to-rule-them-all won't make you the best hacker in the world. It takes years of practice to use such a tool effectively. And it takes many more years just to create the tool, and to refine it to perfection.
The debugger is my katana.