INTRODUCTION TO FIR

Forge performs bounded verification on programs written in the Forge Intermediate Representation (FIR), a simple, imperative and relational programming language. Applying Forge to a conventional high-level language, such as Java, requires a translation from that language to FIR. The Forge framework includes a Java-to-FIR translation that will soon be made available for download, and a translation from C is currently being explored.

When we say FIR is "relational", we mean that every expression evaluates to a relation, i.e. a set of tuples, where each tuple is a sequence of atoms. The arity of a relation (the length of its tuples) can be any positive integer. A set of atoms can be represented by a unary relation (relation of arity 1), and a scalar by a singleton set. The word "imperative" distinguishes FIR from declarative languages like Alloy and Prolog. Although declarative statements can be embedded in FIR code, its structure is that of a standard procedural language with control-flow and procedure invocation.

In designing FIR, our objective was to develop a representation, based on relational logic, that could express both programs and their specifications, and that allows specifications to be embedded as statements in the code. To make the representation amenable to analysis, we made simplicity and uniformity of the language a key goal.

Our desire for FIR to be based on relational logic stems from our experience with the Alloy modeling language and the Kodkod model finder. From our work with Alloy, we understood the power of first-order logic and relational calculus to specify systems and encode object-oriented programs in a concise and straightforward way. We were familiar, too, with the strength of the Kodkod at efficiently finding models to formulas in such a logic.

By allowing specifications to be embedded as statements in code, FIR would conveniently support modular analysis. That is, once a procedure is deemed to meet a specification, calls to that procedure could be replaced with instantiations of the specification. Because the specification is usually more compact and sometimes an abstraction of the code, using it in place of the implementation tends to greatly improve performance. It also allows the implementation to vary, so long as the new implementation still satisfies the specification.

A FIR PROGRAM

The grammar for a FIR program is given in Figure 1. As shown, a program consists of declarations of domains, variables, literals, and procedures. A domain is a set of atoms that is disjoint from all other domains. In a conventional object-oriented language, a domain might be called a "class"; however, we have avoided that term because unlike traditional classes, domains cannot be structured hierarchically.

Two domains are built into the language: the domain of boolean values and the domain of integers. The other domains -- called instance domains -- are created by the user when constructing a FIR program. They are called "instance" domains, because they are the only domains from which new atoms can be instantiated. As of this writing, there is no native support for a real number domain in FIR.

The type of a FIR expression is either a domain or some combination obtained by the union or cross product of domains. Consider a program with instance domains Animal and Plant, for example, and an expression that maps each animal or plant to its age in years. That expression would have an arity of two and the type (Animal ∪ Plant) → Integer. Naturally, the arity of an expression always matches the arity of its type.

Variables are declared with an identifier and a type, and as either global or local. The values of global variables are visible to all procedures in the program across all invocations, whereas the values of local variables are confined to a given procedure invocation. Although there is one set of local variables common to the entire program, semantically, each procedure invocation has access only to its own unique copy of those variables.

Literals are constant values that evaluate to singleton sets, and the value of each literal is disjoint from every other literal. There are built-in literals for the boolean constants true and false and for each integer. A client may declare any number of literals belonging to the declared instance domains. For example, to express an enumerated type for playing card suits, a client could declare an instance domain named ``CardSuit'' and four literals in the domain named "spades", "hearts", "diamonds", and "clubs".

To simplify automatic analysis, each expression in FIR is side-effect free. The program elements described above -- domains, variables, and literals -- are the leaf expressions in the language. When treated as an expression, a domain evaluates to its extent, i.e. the set of atoms that have so far been instantiated from the domain. Composite expressions can be built from standard set operators, including union (∪), intersection (∩), and difference (\), as well as the relational operators join (.), cross product (→), override (⊕), projection (π), and transitive closure (^). Boolean, arithmetic, bitwise operators, set comprehension, and universal and existential quantification are supported as well. The grammar for FIR expression is given in Figure 2

A FIR PROCEDURE

A FIR procedure consists of an identifier that names the procedure, any number of input parameters, any number of output parameters, and a control-flow graph of program statements. There are no "return" statements in FIR code; instead, the final value of an output parameter is visible to a procedure's caller, and because a procedure can have more than one output parameter, the language supports multiple return values.

There are three kinds of statements: branch, update, and exit. Each branch statement has exactly two successors in the control-flow graph; update statements have exactly one successor; and exit statements have no successors. The same statement may be a successor to any number of statements, so join points and loops are certainly allowed.

Update statements are the only statements that may modify the program state. There are four kinds of update statements: assign, create, call, and spec. Assign statements assign the value of an expression to a variable, global or local. A create statement generates a new instance from an instance domain and assigns it to a variable. A call statement invokes a procedure.

Specification statements, as in other verification languages, state boolean conditions that are presumed to be true. In FIR, however, assume statements may also have side effects. A spec statement first lists the variables that may be modified, and then a formula that relates the old and new values of the modified variables. An old expression can be used in the condition to refer to the value of a variable before the statement. Thus, the following statement may only modify the integer variable x variable and only in such a way that its value after the statement is strictly greater than its value before the statement:

x = spec(x > old(x))

Spec statements provide a way of embedding specifications as ordinary statements in the code, an idea borrowed from the refinement calculus. A spec statement without side-effects, often called an "assume" statement can be constructed simply by leaving the list of modified variables empty.

FIGURE 1: FIR PROGRAM GRAMMAR

Program	::=	Domain* Variable* Literal* Procedure*

Domain	::=	boolean \| integer \| InstanceDomain
Type	::=	∅ \| Domain \| Type ∪ Type \| Type → Type
InstanceDomain	::=	domainId

Variable	::=	LocalVariable \| GlobalVariable
LocalVariable	::=	localId : Type
GlobalVariable	::=	globalId : Type
VariableId	::=	localId \| globalId

Literal	::=	BooleanLiteral \| IntegerLiteral \| InstanceLiteral
BooleanLiteral	::=	true \| false
IntegerLiteral	::=	0 \| -1 \| 1 \| -2 \| 2 \| . . .
InstanceLiteral	::=	literalId : domainId

Procedure	::=	procedureId(localId) returns (localId) Stmt
Stmt	::=	BranchStmt \| UpdateStmt;Stmt \| ExitStmt
BranchStmt	::=	if Expr then Stmt else Stmt
UpdateStmt	::=	AssignStmt \| CreateStmt \| CallStmt \| SpecStmt

AssignStmt	::=	variableId = Expr
CreateStmt	::=	variableId = new domainId
CallStmt	::=	variableId* = procedureId(Expr*)
SpecStmt	::=	variableId* = spec(Expr)

FIGURE 2: FIR EXPRESSION GRAMMAR

Expr ::=
	domainId \| variableId \| literalId	leaf
\|	Expr {∪ \| ∩ \| \ } Expr	set operators
\|	Expr . Expr	join
\|	Expr → Expr	cross product
\|	Expr ⊕ Expr	override
\|	π_{IntegerLiteral*}(Expr)	projection
\|	^Expr	transitive closure
\|	{VariableId* . Expr}	set comprehension
\|	Expr in Expr	subset
\|	Expr = Expr	equality
\|	Expr != Expr	inequality
\|	not Expr	negation
\|	Expr {and \| or} Expr	boolean operators
\|	Expr ? Expr : Expr	conditional
\|	{∀ \| ∃ \| ∑} Expr	quantification
\|	Expr {+ \| - \| × \| ÷ \| mod} Expr	arithmetic
\|	Expr {> \| < \| ≥ \| ≤} Expr	integer inequality
\|	Expr { \| \| & \| ^ } Expr	bitwise operators
\|	Expr { << \| >> \| >>> } Expr	bit shifting
\|	# Expr	cardinality
\|	old(Expr)	old expression

This work is licensed under a Creative Commons Attribution-Share Alike 3.0 United States License.