Extending DynamoRIO: The DynamoRIO API

DynamoRIO API
Instruction Representation
Client-Exported Routines
Special Features
Transparency Requirements
Examples

Introduction

DynamoRIO exports a rich Application Programming Interface (API) to the user for building a DynamoRIO client. A DynamoRIO client is a program that is coupled with DynamoRIO in order to jointly operate on an input program binary. To interact with the client, DynamoRIO defines specific hook functions: functions that, if supplied by a user client, are called by DynamoRIO when it is invoked on an input program. We first define the set of main API functions. DynamoRIO also exports a rich set of functions and data structures to manipulate IA-32 instructions. Their definitions can be found in the various DynamoRIO header files. We then provide examples that illustrate how the DynamoRIO API is used to build a DynamoRIO client, including a call instrumentation tool and a simple optimizer.

DynamoRIO API

The API provides routines for manipulating the Instr and InstrList data structures, including decoding and encoding them, which are discussed below. It also provides printing and memory allocation routines that won't interfere with the application. Remember that DynamoRIO interrupts the application at arbitrary points, so unless all routines in question are re-entrant, you cannot use the same routines as the application inside DynamoRIO or its client.

The API provides routines for spilling registers to DynamoRIO's own thread-private spill slots, and for saving and restoring the arithmetic flags using an instruction sequence that is much faster than pushf/popf. Also provided are simple mutex routines to make it easy to develop thread-safe clients, and routines for saving and restoring floating-point, MMX, and SSE state (which is not saved by DynamoRIO's context switches or its clean call mechanism).

The API is defined in the following header files:

dynamorio.h = includes application and statistics interfaces
dynamorio_ir.h = top level of our interface
dynamorio_ir_api.h = high-level routines
dynamorio_ir_instrlist.h = instruction list data structures
dynamorio_ir_opnd.h = operand data structures
dynamorio_ir_instr.h = instruction data structures
dynamorio_ir_opcodes.h = OP_ constants
dynamorio_ir_macros.h = instruction creation macros

Instruction Representation

Two key features of the DynamoRIO instruction representation are responsible for its efficiency, which is crucial when operating at runtime: linear control flow and an adaptive level of detail for representing instructions.

Linear Control Flow

Both basic blocks and traces are linear, so DynamoRIO's instruction sequences are all single entrance, multiple exit. This greatly simplifies analysis algorithms. A basic block or a trace is represented by a linked list of instructions called an InstrList. A component of the list, an Instr, can represent a single instruction or a group of undecoded instructions, as will be shown next.
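As a rough illustration of why linearity keeps analysis simple, the following C sketch models an instruction sequence as a doubly linked list. The names model_instr_t, model_list_t, model_append, model_preinsert, and model_count_exits are hypothetical stand-ins invented for this sketch; they are not the DynamoRIO Instr and InstrList types or routines.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an Instr: one node in a linear sequence. */
typedef struct model_instr {
    int id;                      /* placeholder payload */
    int is_exit;                 /* nonzero if this node can leave the sequence */
    struct model_instr *prev;
    struct model_instr *next;
} model_instr_t;

/* Hypothetical stand-in for an InstrList. */
typedef struct {
    model_instr_t *first;
    model_instr_t *last;
} model_list_t;

/* Appends node to the end of the list. */
static void model_append(model_list_t *list, model_instr_t *node)
{
    assert(list != NULL && node != NULL);
    node->next = NULL;
    node->prev = list->last;
    if (list->last != NULL)
        list->last->next = node;
    else
        list->first = node;
    list->last = node;
}

/* Inserts node immediately before where, which must already be in the list. */
static void model_preinsert(model_list_t *list, model_instr_t *where,
                            model_instr_t *node)
{
    assert(list != NULL && where != NULL && node != NULL);
    node->next = where;
    node->prev = where->prev;
    if (where->prev != NULL)
        where->prev->next = node;
    else
        list->first = node;
    where->prev = node;
}

/* Counts exits with one forward walk; there is no branching structure
 * inside the sequence that the walk would have to follow. */
static int model_count_exits(const model_list_t *list)
{
    const model_instr_t *cur;
    int exits = 0;
    assert(list != NULL);
    for (cur = list->first; cur != NULL; cur = cur->next)
        if (cur->is_exit)
            exits++;
    return exits;
}
```

Because there is a single entrance and no internal join points, every analysis in this model is a plain walk from first to last.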

Adaptive Level of Detail

It is costly to decode IA-32 instructions. Fortunately, DynamoRIO is often only interested in high-level information for a subset of instructions, such as just the control-flow instructions. Each instruction can be at one of five levels of detail, and an instruction can be switched incrementally between levels.

Level 0: Raw Bundle

Instr holds the raw bytes for a group of instructions, decoded only enough to determine the final instruction boundary:

8d 34 01 8b 46 0c 2b 46 1c 0f b7 4e 08 c1 e1 07 3b c1 0f 8d a2 0a 00 00

Level 1: Raw Individual

Instr holds only one instruction, but just points to its raw bytes:

8d 34 01

8b 46 0c

2b 46 1c

0f b7 4e 08

c1 e1 07

3b c1

0f 8d a2 0a 00 00
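The relationship between a Level 0 bundle and its Level 1 instructions can be sketched in C as follows. This is only a model under stated assumptions: raw_instr_t and split_bundle are hypothetical names, and the instruction lengths are supplied by the caller, whereas a real decoder would compute them from the bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical Level 1 view: a pointer into the bundle plus a length. */
typedef struct {
    const unsigned char *bytes;
    size_t length;
} raw_instr_t;

/* Splits bundle into per-instruction views using caller-supplied lengths.
 * out must have room for num_lengths entries. Returns the number of views
 * written, or 0 if the lengths do not exactly cover the bundle. */
static size_t split_bundle(const unsigned char *bundle, size_t bundle_len,
                           const size_t *lengths, size_t num_lengths,
                           raw_instr_t *out)
{
    size_t offset = 0;
    size_t i;
    assert(bundle != NULL && lengths != NULL && out != NULL);
    for (i = 0; i < num_lengths; i++) {
        if (lengths[i] == 0 || lengths[i] > bundle_len - offset)
            return 0;
        out[i].bytes = bundle + offset;  /* no copy: points at the raw bytes */
        out[i].length = lengths[i];
        offset += lengths[i];
    }
    return offset == bundle_len ? num_lengths : 0;
}
```

For the 24-byte bundle shown above, the lengths 3, 3, 3, 4, 3, 2, and 6 yield the seven Level 1 instructions listed, each still pointing into the original bytes.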

Level 2: Opcode and Eflags

Instr has been decoded just enough to determine opcode and eflags effects (important when analyzing IA-32 code). Raw bytes still used for encoding. The flags effects below are only shown for reading (R) or writing (W) the six arithmetic flags (Carry, Parity, Adjust, Zero, Sign, and Overflow).

8d 34 01	lea	-
	|
8b 46 0c	mov	-
	|
2b 46 1c	sub	WCPAZSO
	|
0f b7 4e 08	movzx	-
	|
c1 e1 07	shl	WCPAZSO
	|
3b c1	cmp	WCPAZSO
	|
0f 8d a2 0a 00 00	jnl	RSO
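One way to picture the eflags information held at Level 2 is as a pair of bitmasks, one for flags read and one for flags written. The sketch below is illustrative only: the bit assignments, flag_effect_t, and format_effect are assumptions made for this example and are not DynamoRIO's own representation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical bit assignments for the six arithmetic flags. */
enum {
    FLAG_C = 1 << 0,
    FLAG_P = 1 << 1,
    FLAG_A = 1 << 2,
    FLAG_Z = 1 << 3,
    FLAG_S = 1 << 4,
    FLAG_O = 1 << 5,
    FLAG_ALL = (1 << 6) - 1
};

/* Hypothetical summary of one instruction's effect on the flags. */
typedef struct {
    const char *opcode;
    unsigned read;      /* flags the instruction reads */
    unsigned written;   /* flags the instruction writes */
} flag_effect_t;

/* Formats an effect in the notation used above, such as "RSO", "WCPAZSO",
 * or "-". buf must hold at least 16 bytes. */
static void format_effect(const flag_effect_t *e, char *buf)
{
    static const char names[] = "CPAZSO";
    size_t n = 0;
    int i;
    assert(e != NULL && buf != NULL);
    if (e->read == 0 && e->written == 0) {
        strcpy(buf, "-");
        return;
    }
    if (e->read != 0) {
        buf[n++] = 'R';
        for (i = 0; i < 6; i++)
            if (e->read & (1u << i))
                buf[n++] = names[i];
    }
    if (e->written != 0) {
        buf[n++] = 'W';
        for (i = 0; i < 6; i++)
            if (e->written & (1u << i))
                buf[n++] = names[i];
    }
    buf[n] = '\0';
}
```

With this encoding, the jnl above would carry read = FLAG_S | FLAG_O and written = 0, and the sub would carry written = FLAG_ALL.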

Level 3: Operands

Instr contains dynamically allocated arrays of source and destination operands that are filled in. The arrays are dynamic because the number of operands varies so much across IA-32 instructions. Raw bytes are still valid and are used for encoding, so this level combines high-level information with quick encoding.

8d 34 01	lea	(%ecx,%eax,1) => %esi	-
	|
8b 46 0c	mov	0xc(%esi) => %eax	-
	|
2b 46 1c	sub	0x1c(%esi) %eax => %eax	WCPAZSO
	|
0f b7 4e 08	movzx	0x8(%esi) => %ecx	-
	|
c1 e1 07	shl	$0x07 %ecx => %ecx	WCPAZSO
	|
3b c1	cmp	%eax %ecx	WCPAZSO
	|
0f 8d a2 0a 00 00	jnl	$0x77f52269	RSO
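To make the step from raw bytes to an operand concrete, here is a small C sketch of how the target of the final jnl could be derived from its bytes. The function name jcc_rel32_target is hypothetical, and the instruction address used in the note after the code is an assumption inferred from the target shown above; the listing itself does not state it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Computes the target of a 6-byte near conditional branch (0f 8x rel32).
 * instr_addr is the address at which the instruction bytes reside. */
static uint32_t jcc_rel32_target(const unsigned char *bytes, size_t length,
                                 uint32_t instr_addr)
{
    uint32_t rel;
    assert(bytes != NULL && length == 6);
    assert(bytes[0] == 0x0f && (bytes[1] & 0xf0) == 0x80);
    /* The displacement is stored little-endian in the last four bytes. */
    rel = (uint32_t)bytes[2] | ((uint32_t)bytes[3] << 8) |
          ((uint32_t)bytes[4] << 16) | ((uint32_t)bytes[5] << 24);
    /* The displacement is relative to the address of the next instruction;
     * unsigned wraparound handles negative displacements. */
    return instr_addr + (uint32_t)length + rel;
}
```

The bytes 0f 8d a2 0a 00 00 carry the displacement 0x00000aa2, so if the instruction sat at 0x77f517c1 the computed target would be 0x77f517c1 + 6 + 0xaa2 = 0x77f52269, matching the operand shown.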

Level 4: Modified Operands

Instr has been modified at the operand level, or has been created from operands, such that the raw bytes are no longer valid. Instr must be fully encoded from its operands.

	lea	(%ecx,%eax,1) => %edi	-
	|
	mov	0xc(%edi) => %eax	-
	|
	sub	0x1c(%edi) %eax => %eax	WCPAZSO
	|
	movzx	0x8(%edi) => %ecx	-
	|
c1 e1 07	shl	$0x07 %ecx => %ecx	WCPAZSO
	|
3b c1	cmp	%eax %ecx	WCPAZSO
	|
0f 8d a2 0a 00 00	jnl	$0x77f52269	RSO

Basic Block Example

As an example of using different levels of detail, a basic block in DynamoRIO is represented using a Level 0 Instr for all non-control-flow instructions, and a Level 3 Instr for the block-ending control-flow instruction:

8d 34 01 8b 46 0c 2b 46 1c 0f b7 4e 08 c1 e1 07 3b c1
|
0f 8d a2 0a 00 00	jnl	$0x77f52269	RSO

API Instruction Manipulation Routines

Decoding

The table below gives the name of the decoding routine to build an instruction at each level of detail:

Level	Initial Decode	Upgrade Existing Lower-Level Instr
Level 0	decode_next_pc	(nothing lower)
Level 1	decode_raw	instr_expand
Level 2	decode_opcode	instr_decode_opcode
Level 3	decode	instr_decode
our basic blocks	decode_cti	instr_decode_cti

InstrLists Containing Level 0 Bundles

To give the client control over decoding overheads, we pass the InstrList for both basic blocks and traces as it is used internally: a mixture of Level 0, Level 3, and Level 4 instructions. If the client wishes to access information about instructions, it should usually expand the list so that every instruction is at Level 1. Once at Level 1, an instruction is automatically upgraded to the appropriate level when information is asked of it (Level 2 if the opcode is required, or Level 3 if operands are required). The one exception is the transition from Level 0 to Level 1, which needs a context, an InstrList, to perform the instruction separation. The Level 0 to Level 1 transition must therefore be performed explicitly. In fact,

instr_get_{opcode,src,dst,target} and instr_num_{srcs,dsts} WILL FAIL WHEN PASSED A LEVEL 0 INSTRUCTION!

This applies to routines that call these, as well -- for example, instr_is_cti calls instr_get_opcode. In general, instr_is routines require the opcode. There are a few exceptions, and they are documented as such: instr_is_exit_cti, instr_is_cti_short, and instr_is_cti_short_rewrite.

To handle Level 0 instructions, they must be expanded into a series of Level 1 instructions using either instr_expand or the expanding iterator:

instrlist_first_expanded
instr_get_next_expanded
instr_get_prev_expanded
instrlist_last_expanded

Here is a sample for loop using the expanding iterator:

    for (instr = instrlist_first_expanded(drcontext, bb);
	 instr != NULL;
	 instr = instr_get_next_expanded(drcontext, bb, instr))

Another expander is instrlist_decode_cti, which performs expansion plus decode_cti on all instructions, and additionally hooks up ctis that have Instr targets to each other. Once all instructions are expanded, all other level upgrades are handled automatically.

To see the levels of each instruction in an InstrList, use the instrlist_disassemble routine.

Instruction Generation

Instructions can be generated at two levels of detail:

Specify the opcode and all operands explicitly
Build an IA-32 instruction using an INSTR_CREATE_opcode macro, which fills in implicit operands for you

Client-Exported Routines

A client obtains hooks into DynamoRIO by exporting routines with certain names. DynamoRIO calls those hooks at appropriate times.

Initialization and Termination

DynamoRIO provides hooks for process-wide and per-thread client initialization and termination. These hooks allow the client to perform both global and per-thread initialization and cleanup. The fork_init routine is only applicable to Linux, and is meant to be used for re-initialization of data structures and creation of new log files. It is called by the child of a fork. DynamoRIO creates a new log directory and new log files for the child, and this callback gives the client a chance to do the same (open files are shared with the parent otherwise).

void dynamorio_init();
void dynamorio_exit();
void dynamorio_fork_init();
void dynamorio_thread_init(void *drcontext);
void dynamorio_thread_exit(void *drcontext);

Basic Block Creation

DynamoRIO provides hooks for each time code is placed into the code cache. Through these hooks the client has the ability to inspect and transform any piece of code that is emitted into one of the code caches.

void dynamorio_basic_block(void *drcontext, app_pc tag, InstrList *bb);

DynamoRIO calls the user-supplied routine dynamorio_basic_block each time a basic block is created. The drcontext parameter is a pointer to the input program's machine context. It is critical for correct program execution that the input program's context remains intact. Thus, the user is not expected to inspect or modify the context, and it is passed as an opaque pointer (i.e., void *). The basic block is passed to the client routine as a pointer to an InstrList (for the definition of InstrList and many other data structures for instruction manipulation, see dynamorio_ir_instrlist.h). Each fragment is identified within DynamoRIO by a unique tag. The tag for a basic block fragment is its original starting address in the input program image, which is also passed to the client routine.

The block of code that is passed to the client routine is the fragment block without exit stubs. That is, a copy of the original image code block except that (1) direct unconditional branches are eliminated, (2) unconditional direct calls are "walked into", and (3) a terminating unconditional branch is added to catch the fall-through case of the branch that originally terminated the block.

DynamoRIO will perform its usual processing on the basic block after the client has finished with it. It is important that the client mark any control flow instructions that it does not want mangled by DynamoRIO as "meta-instructions". This is done using the instr_set_do_not_mangle(instr, true) API routine, or the convenience routine dr_instrlist_meta_preinsert().

Trace Creation

void dynamorio_trace(void *drcontext, app_pc tag, InstrList *trace);

DynamoRIO calls the client-supplied routine dynamorio_trace each time a trace is created, just before the trace is emitted into the trace cache. The parameters drcontext and tag are defined as in dynamorio_basic_block. The trace is passed as an InstrList after it has been processed by DynamoRIO. That is, all branches inside the trace have been realigned for emission into the code cache. The client sees exactly the code that will execute in the code cache. This includes transformations of indirect branches performed by DynamoRIO: saving eflags, comparing the branch target to that which follows the trace, and restoring the flags. The client should be aware of these code sequences.

Basic Block and Trace Deletion

void dynamorio_fragment_deleted(void *drcontext, app_pc tag);

DynamoRIO calls this user-supplied routine each time a fragment is deleted from the block or trace cache. Through this hook the client is always informed about any code deletion from one of the code caches. Such information is needed if the client maintains its own data structures about emitted fragment code that must be kept consistent across fragment deletions.
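The kind of bookkeeping that the deletion hook keeps consistent might look like the following C sketch. Everything here is hypothetical: frag_record_t, record_add, record_lookup, and record_remove are names made up for illustration, the fixed-size array is a simplification, and a multi-threaded client would additionally need to guard the table with a mutex.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_RECORDS 128

/* Hypothetical per-fragment bookkeeping kept by a client. */
typedef struct {
    const void *tag;     /* fragment tag (original start address) */
    int num_instrs;      /* example payload recorded at creation time */
    int in_use;
} frag_record_t;

static frag_record_t records[MAX_RECORDS];

/* Returns the record for tag, or NULL if none exists. */
static frag_record_t *record_lookup(const void *tag)
{
    size_t i;
    for (i = 0; i < MAX_RECORDS; i++)
        if (records[i].in_use && records[i].tag == tag)
            return &records[i];
    return NULL;
}

/* Would be called from a creation hook. Reuses an existing record for the
 * same tag. Returns 1 on success, 0 if the table is full. */
static int record_add(const void *tag, int num_instrs)
{
    size_t i;
    frag_record_t *rec = record_lookup(tag);
    assert(tag != NULL);
    for (i = 0; i < MAX_RECORDS && rec == NULL; i++)
        if (!records[i].in_use)
            rec = &records[i];
    if (rec == NULL)
        return 0;
    rec->tag = tag;
    rec->num_instrs = num_instrs;
    rec->in_use = 1;
    return 1;
}

/* Would be called from the deletion hook. Returns 1 if a record was removed. */
static int record_remove(const void *tag)
{
    frag_record_t *rec = record_lookup(tag);
    if (rec == NULL)
        return 0;
    rec->in_use = 0;
    return 1;
}
```

Without the removal step, a stale record could later be mistaken for information about new code placed at the same tag.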

Special Features

This section highlights some of the important features of the DynamoRIO client interface.

Clean Calls

To make it easy to insert code into the application instruction stream, DynamoRIO provides a clean call mechanism. This allows insertion of a call to a client routine that is completely transparent. DynamoRIO inserts code to save all the general-purpose registers and the eflags prior to the call, and even switches to the DynamoRIO stack to avoid relying on or tainting the application stack. Everything is restored after the call.

void dr_prepare_for_call(void *drcontext, InstrList *ilist, Instr *instr);
/* push args, do call here */
void dr_cleanup_after_call(void *drcontext, InstrList *ilist, Instr *where, uint sizeof_param_area);

Note that clean calls do NOT save or restore floating-point, MMX, or SSE state. For that, use these routines:

void proc_save_fpstate(byte *buf);
void proc_restore_fpstate(byte *buf);

These routines require a buffer that is 16-byte-aligned and of a certain size (512 bytes for processors with the FXSR feature, and 112 bytes for those without). Here is a sample usage:

    byte fp_raw[512 + 16];
    byte *fp_align = (byte *) ( (((uint)fp_raw) + 16) & 0xfffffff0 );
    proc_save_fpstate(fp_align);
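The sample above casts the pointer to uint, which assumes 32-bit pointers. A pointer-width-independent way to do the same rounding is sketched below; align_up is a hypothetical helper written for this illustration and is not part of the DynamoRIO API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Rounds p up to the next multiple of alignment, which must be a power of
 * two. Returns p unchanged if it is already aligned. */
static unsigned char *align_up(unsigned char *p, size_t alignment)
{
    uintptr_t addr = (uintptr_t)p;
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    addr = (addr + (alignment - 1)) & ~(uintptr_t)(alignment - 1);
    return (unsigned char *)addr;
}
```

With this helper, a raw buffer of 512 + 15 bytes is enough to guarantee a 16-byte-aligned region of 512 bytes, since at most 15 bytes are skipped.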

Branch Instrumentation

DynamoRIO provides explicit support for instrumenting control transfer instructions. These convenience routines insert clean calls to client-provided methods, passing as arguments the instruction pc and target pc of each control transfer, along with taken or not taken information for conditional branches:

void dr_insert_call_instrumentation(void *drcontext, InstrList *ilist, Instr *instr, void *callee);
void dr_insert_mbr_instrumentation(void *drcontext, InstrList *ilist, Instr *instr, void *callee);
void dr_insert_cbr_instrumentation(void *drcontext, InstrList *ilist, Instr *instr, void *callee);

Adaptive Optimization

Two routines are provided to support adaptive optimization in the form of re-optimizing existing trace fragments:

InstrList * dr_decode_fragment(void *drcontext, app_pc tag);
bool dr_replace_fragment(void *drcontext, app_pc tag, InstrList *ilist);

DynamoRIO will decode and return the resulting InstrList for any trace fragment, identified by tag. The client can then optimize the trace by manipulating the InstrList. The final InstrList can be used to replace the existing fragment by calling the dr_replace_fragment routine. This routine can be called even while inside the to-be-replaced fragment (e.g., in a clean call from inside the fragment). The old trace is unlinked from the rest of the cache, and the rest of the cache is linked up to the new trace. Thus, as soon as execution leaves the old trace (which will occur soon since there can be no loops without links) it will never be executed again, and the new trace will be used in its place.

DynamoRIO does not currently support replacing a fragment from a thread other than the one that owns the fragment. This will be forthcoming in a future release.

Custom Traces

DynamoRIO allows a client to build custom traces by marking its own trace heads and deciding when to end traces. If a client exports the following hook, DynamoRIO will call it before extending a trace (with tag trace_tag) with a new basic block (with tag next_tag):

int dynamorio_end_trace(void *drcontext, app_pc trace_tag, app_pc next_tag);

The client returns one of these values:

CUSTOM_TRACE_DYNAMORIO_DECIDES = use standard termination criteria
CUSTOM_TRACE_END_NOW = end trace now
CUSTOM_TRACE_CONTINUE = do not end trace

The client can also mark any basic block as a trace head using this routine:

bool dr_mark_trace_head(void *drcontext, app_pc tag);
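The decision a client makes in its end-of-trace hook can be sketched as a small pure function. This is only a model: the SKETCH_ constants are placeholders standing in for the three return values listed above (the real constants and their values come from the DynamoRIO headers), and trace_state_t is a hypothetical summary of whatever the client has recorded about the trace being built.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder values for this sketch only; the real constants come from the
 * DynamoRIO headers and may have different values. */
enum {
    SKETCH_DYNAMORIO_DECIDES,
    SKETCH_END_NOW,
    SKETCH_CONTINUE
};

/* Hypothetical summary of what the client knows about the trace so far. */
typedef struct {
    int ends_in_return;   /* the block most recently added ends with a return */
    int inside_call;      /* the trace entered a call it has not yet left */
} trace_state_t;

/* Chooses one of the three answers for a policy that favors traces that
 * start at a call and end right after a return. */
static int sketch_end_trace(const trace_state_t *state)
{
    assert(state != NULL);
    if (state->ends_in_return)
        return SKETCH_END_NOW;         /* stop right after a return */
    if (state->inside_call)
        return SKETCH_CONTINUE;        /* keep going until the callee returns */
    return SKETCH_DYNAMORIO_DECIDES;   /* no opinion: standard criteria apply */
}
```

Falling back to the standard criteria in the last case keeps the client's policy from affecting traces it has no opinion about.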

Custom Exit Stubs

An exit cti can be given an InstrList to be prepended to the standard exit stub. There are set and get methods for this custom exit stub code:

void instr_set_exit_stub_code(Instr *instr, InstrList *stub);
InstrList *instr_exit_stub_code(Instr *instr);

When a fragment is re-decoded, e.g. when being appended to a trace or when re-decoded using dr_decode_fragment, the custom stubs are regenerated and added to the owning exit cti's.

Prefixes

Some code manipulations need to store a target address in a register and then jump there, but need the register to be restored as well. DynamoRIO provides a single-instruction prefix that restores ecx and can be placed on any fragment (basic block or trace). It is already present on traces for internal DynamoRIO use. To have it added to basic blocks as well, call this routine during initialization:

void dr_add_prefixes_to_basic_blocks();

To have a cti target the prefix rather than the normal entry, use these set and get routines:

bool instr_branch_targets_prefix(Instr *instr);
void instr_branch_set_prefix_target(Instr *instr, bool val);

Using DynamoRIO as a Standalone Library

DynamoRIO can be used as a library for a standalone client (i.e., a client that runs on its own, rather than operating on a target program). In order to use the DynamoRIO API in such a situation, a dummy context must be created. This routine creates such a context, and initializes DynamoRIO for standalone use:

void * dr_standalone_init();

Note that this context cannot be used as the drcontext for a thread running under DynamoRIO control! It is only for standalone programs that wish to use DynamoRIO as a library of disassembly, etc. routines.

Transparency Requirements

DynamoRIO only supports transparent clients.

DynamoRIO tries to make it easy to build transparent clients. If a client is not transparent, any number of things could go wrong. If a client modifies the control flow structure of a basic block, DynamoRIO may not build traces properly with it. If a client modifies the eflags behavior of a basic block or trace, DynamoRIO might do the wrong thing in the fragment's eflags restoration prefix.

There is a notion of a "meta" instruction; in our IR it is an instruction for which instr_ok_to_mangle is false. You can add your own control flow, even a call to your own routine, and DynamoRIO will not touch it if it is marked do_not_mangle. It will not interfere with DynamoRIO either, since it can be treated as a native non-control-flow instruction. Use clean calls, and save and restore any state you modify.

There is an unresolved issue of translating contexts for exception/signal handlers. For future releases we plan to provide an interface whereby we can ask the client how to translate code it has manipulated to obtain an original context to show the application's handler.

Examples

The sample programs below, along with a Makefile, can be found in the samples subdirectory of the doc directory in your DynamoRIO installation.

Example 1: Instruction Counting

We now illustrate how to use the above API to implement a simple instrumentation client for counting the number of executed call and return instructions in the input program. Full code for this example is in the file samples/countcalls.c.

The client maintains three global counters, num_direct_calls, num_indirect_calls, and num_returns, to count three different types of instructions during execution. It also uses a mutex to serialize access to these global counters. The client initializes everything by supplying the following dynamorio_init routine:

EXPORT void dynamorio_init()
{
    num_direct_calls = 0;
    num_indirect_calls = 0;
    num_returns = 0;
    dr_mutex_init(&mutex);
}

The client also provides a dynamorio_exit routine that displays the final values of these counters.

To properly handle threads, the client keeps track of each thread's instruction counts separately. To do this, it creates a data structure that will be separately allocated for each thread:

typedef struct {
    int OF_slot; /* used for saving overflow flag */
    int num_direct_calls;
    int num_indirect_calls;
    int num_returns;
} per_thread;

Now the thread hooks are used to initialize the data structure and to add the results to the global counters when the thread exits:

EXPORT void dynamorio_thread_init(void *drcontext)
{
    /* create an instance of our data structure for this thread */
    per_thread *data = (per_thread *)
        dr_thread_alloc(drcontext, sizeof(per_thread));
    /* store it in the slot provided in the drcontext */
    dr_set_drcontext_field(drcontext, data);
    data->num_direct_calls = 0;
    data->num_indirect_calls = 0;
    data->num_returns = 0;
}

EXPORT void dynamorio_thread_exit(void *drcontext)
{
    per_thread *data = (per_thread *) dr_get_drcontext_field(drcontext);
    /* add thread's counters to global ones, inside our lock */
    dr_mutex_lock(&mutex);
    num_direct_calls += data->num_direct_calls;
    num_indirect_calls += data->num_indirect_calls;
    num_returns += data->num_returns;
    dr_mutex_unlock(&mutex);
    /* clean up memory */
    dr_thread_free(drcontext, data, sizeof(per_thread));
}

The real work is done in the basic block hook. We simply look for the instructions we're interested in and insert an increment of the appropriate thread-local counter, remembering to save the flags, of course.

EXPORT void dynamorio_basic_block(void *drcontext, app_pc tag, InstrList *bb)
{
    Instr *instr, *next_instr;
    per_thread *data = (per_thread *) dr_get_drcontext_field(drcontext);
    /* only interested in calls & returns, so can use DynamoRIO instrlist
     * as is, do not need to expand it!
     */
    for (instr = instrlist_first(bb); instr != NULL; instr = next_instr) {
	/* grab next now so we don't go over instructions we insert */
	next_instr = instr_get_next(instr);
	/* we can rely on all ctis being decoded, so skip undecoded instrs
	 * this will also avoid problems w/ asking for opcode of Level 0 instrs */
	if (!instr_opcode_valid(instr))
	    continue;
	/* instrument calls and returns -- ignore far calls/rets */
	if (instr_is_call_direct(instr)) {
	    /* since the inc instruction clobbers 5 of the arith flags,
	     * we have to save them around the inc.
	     * we could be more efficient by not bothering to save the
	     * overflow flag and constructing our own sequence of instructions
	     * to save the other 5 flags (using lahf).
	     */
	    dr_save_arith_flags(drcontext, bb, instr, &(data->OF_slot));
	    instrlist_preinsert(bb, instr, INSTR_CREATE_inc(drcontext,
		OPND_CREATE_MEM32(REG_NULL, (int)&(data->num_direct_calls))));
	    dr_restore_arith_flags(drcontext, bb, instr, &(data->OF_slot));
	} else if (instr_is_call_indirect(instr)) {
	    dr_save_arith_flags(drcontext, bb, instr, &(data->OF_slot));
	    instrlist_preinsert(bb, instr, INSTR_CREATE_inc(drcontext,
		OPND_CREATE_MEM32(REG_NULL, (int)&(data->num_indirect_calls))));
	    dr_restore_arith_flags(drcontext, bb, instr, &(data->OF_slot));
	} else if (instr_is_return(instr)) {
	    dr_save_arith_flags(drcontext, bb, instr, &(data->OF_slot));
	    instrlist_preinsert(bb, instr, INSTR_CREATE_inc(drcontext,
		OPND_CREATE_MEM32(REG_NULL, (int)&(data->num_returns))));
	    dr_restore_arith_flags(drcontext, bb, instr, &(data->OF_slot));
	}
    }
}

Building the Example

To build the client instrumentation application, the client program file (say instrcalls.c) should include the following:

#include "dynamorio.h"
#ifdef LINUX
#  define EXPORT
#else
#  define EXPORT __declspec(dllexport)
#endif

We can then build the client shared library in Windows:

> cl instrcalls.c /I$DYNAMORIO_HOME/include /link /libpath:$DYNAMORIO_HOME/bin dynamorio.lib /dll /out:instrcalls.dll

or in Linux:

> gcc -shared -nostartfiles -DLINUX -I$DYNAMORIO_HOME/include instrcalls.c -o instrcalls.so

The result is a shared library: instrcalls.dll on Windows or instrcalls.so on Linux. To invoke the library, we pass "-instrlibname" followed by the library name (for example, "-instrlibname instrcalls.dll") as a DYNAMORIO_OPTIONS parameter when running an input program under DynamoRIO control.

Example 2: Instruction Profiling

The next example shows how to use the provided control flow instrumentation routines, which allow more sophisticated profiling than simply counting instructions. Full code for this example is in the file samples/instrcalls.c.

As in the previous example, the client is interested in direct and indirect calls and returns. The client wants to analyze the target address of each dynamic instance of a call or return. For our example, we simply dump the data in text format to a separate file for each thread. Since FILE cannot be exported from a DLL on Windows, we use the DynamoRIO-provided File type, which hides the distinction between FILE and HANDLE so that the same code works on Linux and Windows. We make use of the thread initialization and exit routines to open and close the file. We store the file for a thread in the user slot in the drcontext.

EXPORT void dynamorio_thread_init(void *drcontext)
{
    /* we're going to dump our data to a per-thread file */
    char fname[512];
    File f;
    sprintf(fname, "instrcalls-%03d.log", dr_get_thread_id(drcontext));
    f = dr_open_file(fname, false/*write*/);
    assert(f != INVALID_File);
    /* store it in the slot provided in the drcontext */
    dr_set_drcontext_field(drcontext, (void *)f);
    dr_log(drcontext, LOG_ALL, 1, "instrcalls: log for thread %d is %s\n",
	   dr_get_thread_id(drcontext), fname);
}

EXPORT void dynamorio_thread_exit(void *drcontext)
{
    File f = (File) dr_get_drcontext_field(drcontext);
    dr_close_file(f);
}

The basic block hook inserts a call to a procedure for each type of instruction, using the API-provided dr_insert_call_instrumentation and dr_insert_mbr_instrumentation routines, which insert calls to procedures with a certain signature.

EXPORT void dynamorio_basic_block(void *drcontext, app_pc tag, InstrList *bb)
{
    Instr *instr;
    /* only interested in calls & returns, so can use DynamoRIO instrlist
     * as is, do not need to expand it!
     */
    for (instr = instrlist_first(bb); instr != NULL; instr = instr_get_next(instr)) {
        /* we can rely on all ctis being decoded, so skip undecoded instrs
         * this will also avoid problems w/ asking for opcode of Level 0 instrs */
        if (!instr_opcode_valid(instr))
            continue;
        /* the callees are the at_call, at_call_ind, and at_return routines */
        if (instr_is_call_direct(instr)) {
            dr_insert_call_instrumentation(drcontext, bb, instr, (app_pc)at_call);
        } else if (instr_is_call_indirect(instr)) {
            dr_insert_mbr_instrumentation(drcontext, bb, instr, (app_pc)at_call_ind);
        } else if (instr_is_return(instr)) {
            dr_insert_mbr_instrumentation(drcontext, bb, instr, (app_pc)at_return);
        }
    }
}

These procedures look like this:

static void at_call(app_pc instr_addr, app_pc target_addr)
{
    File f = (File) dr_get_drcontext_field(dr_get_current_drcontext());
    dr_fprintf(f, "CALL @ 0x%08x to 0x%08x\n", instr_addr, target_addr);
}

static void at_call_ind(app_pc instr_addr, app_pc target_addr)
{
    File f = (File) dr_get_drcontext_field(dr_get_current_drcontext());
    dr_fprintf(f, "CALL INDIRECT @ 0x%08x to 0x%08x\n", instr_addr, target_addr);
}

static void at_return(app_pc instr_addr, app_pc target_addr)
{
    File f = (File) dr_get_drcontext_field(dr_get_current_drcontext());
    dr_fprintf(f, "RETURN @ 0x%08x to 0x%08x\n", instr_addr, target_addr);
}

The address of the instruction and the address of its target are both provided. These routines could perform some sort of analysis based on these addresses. In our example we simply print out the data.

Example 3: Optimization

For the next example we consider a client application for a simple optimization. The optimizer replaces every increment/decrement operation with a corresponding add/subtract operation if running on a Pentium 4, where the add/subtract is less expensive. For optimizations, we are less concerned with covering all the code that is executed; on the contrary, in order to amortize the optimization overhead, we only want to apply the optimization to hot code. Thus, we apply the optimization at the trace level rather than the basic block level. Full code for this example is in the file samples/inc2add.c.
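One consideration for a transformation like this is that inc and dec leave the carry flag unchanged, whereas add and sub write it, so the replacement is only safe where the carry flag is overwritten before it is next read. The C sketch below models that check on a simplified instruction array. It is not taken from samples/inc2add.c; opt_instr_t, cf_dead_after, and replace_inc_dec are hypothetical names, and the reads_cf and writes_cf fields stand in for eflags information a real client would obtain from the instruction representation.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { OPC_INC, OPC_DEC, OPC_ADD, OPC_SUB, OPC_OTHER } opt_opcode_t;

/* Hypothetical, simplified model of an instruction in a trace. */
typedef struct {
    opt_opcode_t opcode;
    int immed;       /* immediate for OPC_ADD or OPC_SUB, unused otherwise */
    int reads_cf;    /* nonzero if the instruction reads the carry flag */
    int writes_cf;   /* nonzero if the instruction writes the carry flag */
} opt_instr_t;

/* Returns 1 if the carry flag is overwritten before it is read by the
 * instructions following instrs[idx]. Reaching the end of the trace is
 * conservatively treated as a possible read. */
static int cf_dead_after(const opt_instr_t *instrs, size_t count, size_t idx)
{
    size_t i;
    for (i = idx + 1; i < count; i++) {
        if (instrs[i].reads_cf)
            return 0;
        if (instrs[i].writes_cf)
            return 1;
    }
    return 0;
}

/* Rewrites inc as add 1 and dec as sub 1 wherever the carry flag is dead
 * afterwards. Returns the number of instructions replaced. */
static size_t replace_inc_dec(opt_instr_t *instrs, size_t count)
{
    size_t i, replaced = 0;
    assert(instrs != NULL || count == 0);
    for (i = 0; i < count; i++) {
        if (instrs[i].opcode != OPC_INC && instrs[i].opcode != OPC_DEC)
            continue;
        if (!cf_dead_after(instrs, count, i))
            continue;
        instrs[i].opcode = (instrs[i].opcode == OPC_INC) ? OPC_ADD : OPC_SUB;
        instrs[i].immed = 1;
        instrs[i].writes_cf = 1;   /* unlike inc/dec, add/sub write CF */
        replaced++;
    }
    return replaced;
}
```

The check is deliberately conservative: an inc or dec at the end of a trace is left alone because the sketch cannot see what the code after the trace does with the carry flag.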

Example 4: Custom Tracing

This example demonstrates the custom tracing interface. It changes DynamoRIO's tracing behavior to favor making traces that start at a call and end right after a return. It demonstrates the use of both custom trace API elements:

int dynamorio_end_trace(void *drcontext, app_pc trace_tag, app_pc next_tag);
bool dr_mark_trace_head(void *drcontext, app_pc tag);

Full code for this example is in the file samples/inline.c.

Example 5: Use of Floating Point Operation in a Client

Because saving the floating point state is very expensive, DynamoRIO seeks to do so on an as-needed basis. If a client wishes to use floating point operations, it must save and restore the application's floating point state around the usage. This is done through:

void proc_save_fpstate(byte *buf);
void proc_restore_fpstate(byte *buf);

Note that there are restrictions on how these methods may be called; see the documentation in the header files for additional information. Note also that the floating point state must be saved around calls to our provided printing routines when they are used to print floats. However, it is not necessary to save and restore the floating point state around floating point operations if they are being used in the dynamorio_init() or dynamorio_exit() functions. This example client counts the number of basic blocks processed and keeps statistics on their average size using floating point operations. Full code for this example is in the file samples/bbsize.c.

Example 6: Use of Custom Client Statistics with the Windows GUI

The new Windows GUI will display custom client statistics, if they are placed in shared memory with a certain name. The sample samples/customstats.c gives code for the protocol used in the form of a sample client that counts total instructions, floating-point instructions, and system calls.
