MASSACHVSETTS INSTITVTE OF TECHNOLOGY
Spring Semester, 1997
Problem Set 9
Issued: Thursday April 17, 1997
Written solutions due: Wednesday, April 30 for sections meeting at 10 and 11; Friday, May 2 for sections meeting at 9, 12, 1 and 2.
Reading: Attached code (java-eval.scm, and java-support.scm). Note also that java.scm, and java-syntax.scm can be read online in the 6.001 Web page, if you are interested.
Note: The overall programming framework will be discussed in lecture on Thursday, April 17, 1997.
This is a brand new problem set and may very well be buggy. We will take this into account when grading. Please report problems to the lab TAs so that they can be fixed over the summer.
All of the code, as well as the HTML for this handout, is at
http://www-swiss.ai.mit.edu/~u6001/ST97/JMiller/ps9-java/problem-set
Why are we doing this, i.e. what's the point of writing an interpreter for a language when there are perfectly good commercial compilers and development environments available? In order to understand any complicated system it's often worthwhile to build and study a small model. The Decaf Java-In-Scheme interpreter is the computer equivalent of a scale model of a jet airplane, and can serve the same purposes:
In designing Decaf Java-In-Scheme, our main goal was to make the interpreter small enough that it can be understood by someone familiar with Scheme interpreters with about a week of study. That's a strong constraint, and in order to meet it, we've done a good deal of simplifying. Here are some of the major decisions that you will need to understand before examining the code:
The best way to understand Decaf Java-In-Scheme's syntax is to compare it with Java directly. Just as in Java (and, indeed, most modern programming languages), there are three major kinds of constructs: expressions which can be used to compute values; statements which can be executed for side-effects (but which don't compute values); and declarations which are used to define new names (in Java, declarations define classes and interfaces). Notice that any expression can be used where a statement is expected (but not the reverse).
Decaf Java-In-Scheme Expressions
Let's begin by exploring the expressions of the language. The procedure expr? tells us what are legal expressions, its body is:
(or (literal? obj)
(java-name? obj)
(new? obj)
(dot? obj)
(call? obj)
(cast? obj)
(instanceof? obj)
(built-in-expr? obj)
(assignment? obj))
and thus each of these predicates defines a different legal
expression.
We can build a chart that compares Java expressions and the corresponding Decaf Java representations, which we will examine in detail below:
| JAVA | DECAF JAVA-IN-SCHEME |
| 3 | 3 |
| foo | foo |
| new foo() | (new foo) |
| this.F | (dot this f) |
| F(3) | (call f 3) |
| this.F(3, "x") | (call (dot this f) 3 "x") |
| (String) x + y | (cast string (+ x y)) |
| predicate ? consequent : alternative | (? predicate consequent alternative) |
| x = 3 | (= x 3) |
Now let's look at each of these kinds of expressions.
literal expressions are defined as (Scheme) numbers, booleans, characters, strings, and the special names FALSE, NULL, and TRUE. A java-name (java-name?), which is the Decaf Java-In-Scheme equivalent of a Java variable, is any symbol that isn't one of the "magic" words of the Decaf Java-In-Scheme language. For reference, here are the names you can not use for variables (or anything else, for that matter) in Decaf Java-In-Scheme:
!= * *= *BREAK-LABEL* *CONTINUE-LABEL* *RETURN-LABEL* + += - -= / /= < <= = == > >= ? ABSTRACT AND BLOCK CALL CAST CLASS DO DOT EXTENDS FALSE FIELD FINAL FOR IF IMPLEMENTS INSTANCEOF INTERFACE METHOD NEW NEWLINE NULL OR PRINT PRIVATE PROTECTED PUBLIC STATIC SUPER THIS TRUE VARIABLES WHILE
A new expression is used to create new objects, and looks like (NEW <type name>). In Java, arguments can be used to choose which of several different "object constructors" should be called; but this is one of the simplifications we've made: since we don't support the declaration of explicit object constructors, we don't support the syntax for calling them, either.
A dot expression is used for Java constructs like this.F or super.X or A.B.C.X. In Decaf Java-In-Scheme these are written as (DOT this F), (dot super X), or (Dot A b c x) respectively (remember that Decaf Java-In-Scheme doesn't care about upper/lower case distinctions like Java does).
A call represents Java's method invocation. In Decaf Java-In-Scheme we write (CALL F 3) when in Java we would write F(3);. Similarly, we write (CALL (dot this F) 3 "x") when in Java it would be this.F(3, "x");. A cast is written as (CAST String (+ X Y)) instead of Java's (String) X+Y;. Similarly, we write (INSTANCEOF (+ X Y) String) instead of X+Y instanceof String.
A built-in corresponds to Java's set of "infix operators". In Decaf Java-In-Scheme, we support the following binary operators, corresponding directly to their Java counterparts: != * + - / < <= == > >=. As mentioned earlier, Decaf Java-In-Scheme doesn't handle coercion of operands for these built in operations, so be sure to cast the operands if that is required. The following is a legal Decaf Java-In-Scheme expression: (+ X (* Y Z)). Notice that (+ X Y Z) is legal in Scheme, but not in Decaf Java-In-Scheme (Decaf Java-In-Scheme's + operator is binary, while Scheme's takes any number of arguments). Also, + in Decaf Java-In-Scheme is like Java's +, not Scheme's: it will work when both arguments are numbers or when both arguments are strings.
In addition, we support (? <predicate expr> <consequent expr> <alternative expr>) corresponding to Java's <predicate expr> ? <consequent expr> : <alternative expr>. We also support (AND ...) where there can be any number of expressions for ..., corresponding to Java's && operator as well as (OR ...) for Java's || operator. We've added two built-in operators, (newline) and (print <expr>).
Finally, an assignment expression allows us to modify the value of an existing variable (like Scheme's SET!, but with a defined value returned). Decaf Java-In-Scheme, like Java, supports a number of different assignment operations. In Decaf Java-In-Scheme these are = *= /= += -=. For example, (= X 3) changes the value of the variable X to 3 and returns the value 3.
Decaf Java-In-Scheme Statements
The procedure statement? specifies what constitutes a Java statement, that is, what Decaf Java translations into Scheme represent legal statements:
(define (statement? obj)
(or (expr? obj)
(and (pair? obj)
(case (car obj)
((BLOCK) (all? statement? (list-tail obj 1)))
((BREAK) (break? obj))
((CONTINUE) (continue? obj))
((DO) (do? obj))
((FOR) (for? obj))
((IF) (if? obj))
((LABEL) (label? obj))
((RETURN) (return? obj))
((VARIABLES) (variables? obj))
((WHILE) (while? obj))
(else #F)))))
In addition to expressions, the following syntaxes are supported. Notice that Java's switch isn't supported (yet...)
We stress once more that the descriptions below identify what you write in Decaf Java-In-Scheme as being equivalent to Java expressions, since our interpreter requires Scheme based syntax as input.
Decaf Java-In-Scheme Declarations
There are only two declarations in Decaf Java-In-Scheme, one for classes and one for interfaces, and in this problem set we are ignoring interfaces. The following Decaf Java expression corresponds to the same construct in Java:
(CLASS [(modifier ...)] <name>
[(EXTENDS <name> ...)]
[(IMPLEMENTS <name> ...)]
<body form> ...)
(INTERFACE [(modifier ...)] <name>
[(EXTENDS <name> ...)]
<body form> ...)
The entries inside brackets ("[", "]") are optional -- you can leave them out if you want. The ... means that you can repeat the thing before it as often as you like.
The modifer must be one of the set allowed in Java: PUBLIC PROTECTED PRIVATE STATIC ABSTRACT FINAL. (It doesn't make sense to repeat a modifier, but it doesn't matter and Decaf Java-In-Scheme won't complain.) Note: In order to make it easier to deal with these complicated syntaxes, the procedures in java-syntax.scm are a little different from those in the book. Several of them take in the syntax that we're showing you here and return a more stylized syntax that includes entries for everything you could have put in, whether you put it there or not. This is called "canonicalizing" the parse tree and is quite common practice in compilers and interpreters. It makes the predicates for the syntax a bit more complex (they have to do the canonicalization) but the result is that the selectors are simpler and faster.
A body form can be one of three things:
(METHOD [(<modifier> ...))]
<type name> <method name> (<parameter> ...)
<statement>)
where a parameter is (<type name>
<parameter name>).
For example, here is a Decaf Java method:
(METHOD (public static) void main () (call (DOT System out println) "Hello, world"))
The corresponding Java code is:
public static void main()
{
System.out.println("Hello, world");
}
(FIELD [(<modifier> ...)]
<type name> (<variable declaration> ...))
.
Here is a Decaf Java-In-Scheme example:
(class parent
(field string ((= bar "Parent bar")))
(method void foo ((string s))
(block
(print (+ s (+ ": Parent foo, " (dot this bar))))
(newline)))
(method void foo2 ((string s))
(block
(print (+ s (+ ": Parent foo2 called, "
(dot this bar))))
(newline)
(call (dot this foo) s))))
and here is the corresponding Java code:
class Parent
{
String bar = "Parent bar;
void foo (String S)
{
System.out.println(S + ": Parent foo, " + this.bar);
}
void foo2 (String S)
{
System.out.println(S + ": Parent foo2 called, " + this.bar);
this.foo(S);
}
}
class NumericFun
{ int TheNumber;
public void SetNumber(int n)
{ TheNumber = n;
}
private int Fib(int n)
{ if (n==0) return 0;
else if (n==1) return 1;
else return Fib(n-1)+Fib(n-2);
}
public int Fib()
{ return Fib(TheNumber);
}
}
To help you get started in the Java-In-Scheme version, here's a
partial template for you to complete:
(define NumericFun
'(class NumericFun
(field number (TheNumber))
(method (public) void SetNumber ((number n))
(= TheNumber n))
(method ...) ; Fib (n)
(method ...))) ; Fib ()
(run-exprs
(list NumericFun
'(global NumericFun
(= Nine (new NumericFun))
(= Ten (new NumericFun)))
'(call (dot Nine SetNumber) 9)
'(call ...)
'(print (call ...)) ; (fib 9)
'(newline)
'(print (call ...)) ; (fib 10)
'(newline)))
(method number fact-fn ((number n))
(variables number ((= result 1))
(block
(for (number (= i 2)) (<= i n) (+= i 1)
(*= result i))
(return result))))
class Rotating
{
int start = 0; // variable to hold state of object
int end; // variable to determine last value before
// going back to 0
Rotating(int limit)
{
end = limit;
}
void Shift()
{
start += 1;
if (start > end)
{
start = 0;
}
}
int Status()
{
return start;
}
}
If this is done consistently throughout the interpreter the flow-altering operations just call some procedure other than the one they are told to call, and the effect is that the program runs in a different order. This is quite subtle, and you don't have to worry about understanding it in detail. Just bear in mind that evaluation procedures (including j-eval and j-eval-in-order) do not return values to the procedure that called them, but call their next procedure instead.
The main procedure of the Decaf Java-In-Scheme interpreter is j-eval:
(define (j-eval exp env next)
;; EXP is a Decaf Java-In-Scheme expression.
;; ENV is an environment (see abstraction below)
;; (NEXT value) is called when the value has been computed
(cond
;; Expressions
((literal? exp) (j-eval-literal exp env next))
((variable? exp) (j-eval-variable exp env next))
...
;; Statements
((variables? exp) (j-eval-variables exp env next))
((if? exp) (j-eval-if exp env next))
...
;; Handle declarations specially
((global? exp) (j-eval-global exp env next))
((quick-class-test? exp)
(let ((full-class (class? exp)))
(if full-class
(j-eval-class full-class env next)
(error "Bad syntax for CLASS" exp env))))
((quick-interface-test? exp)
(let ((full-interface (interface? exp)))
(if full-interface
(j-eval-interface full-interface env next)
(error "Bad syntax for INTERFACE" exp env))))
(else (error "Unknown type of expression" exp env))))
For the most part, this just calls the syntax testing procedures and
dispatches to the appropriate handling procedure. The only exception
is the handling of class and interface where the
syntax procedure (as mentioned earlier) canonicalizes its
input. In this case, the canonicalized input is passed to the handler
instead of the original expression.
Let's take a look at a few representative evaluation procedures. We'll start with the very simplest, j-eval-literal:
(define (j-eval-literal exp env next) (next (if (eq? exp 'NULL) '() exp)))All it has to do is figure out the value of the expression exp and call next with that value. That's easy: for any literal except the NULL the value of the literal is itself. We represent Decaf Java-In-Scheme's NULL with Scheme's empty list, as shown.
OK, how about something harder? Let's take a look at how if is handled:
(define (j-eval-if exp env next)
;; (IF <test> <consequent> <alternative>)
(j-eval (if.predicate exp) env
(lambda (pred) ; (1)
(cond ((eq? pred #T)
(j-eval (if.consequent exp) env next))
((eq? pred #F)
(j-eval (if.alternative exp) env next))
(else (error "IF: Not a boolean value"))))))
The structure is slightly different from the version we've used for
Scheme. Remember that j-eval doesn't return a value.
Instead, it calls its third argument with the value it has computed.
So how do we read this procedure? It starts by calling
j-eval to compute the value of the (if.predicate
exp) in the current environment. The value of this
expression is passed to the procedure that's marked (1) as
the parameter pred. Notice how this defines a new
next procedure, thereby passing the evaluation along.
This procedure checks to see if
pred is true, false, or something else. If
it's true, then we call j-eval again to compute the value of
the (if.consequent exp) in the same environment,
and tell it to pass the value on to the procedure named
next. This is important: remember that j-eval
won't return a value -- it will call next instead.
Does that make sense? Let's see. We were supposed to compute the value of some big if expression and call next with the answer when we figure it out. So first we found out that the predicate of the if was true. Then we decided to call j-eval to find the value of the consequent and we've told j-eval that it should pass that answer on to next. Aha! That's clever. That means that we don't get back into the loop of handling the result any more -- we've simplified our original problem (finding the value of the if) to the problem of finding the value of the consequent of the if, and we can turn that completely over to someone else (j-eval) to take care of. This is what is known as reducing the problem to a simpler case. In this style of interpreter, we always try to reduce problems when we can, and we know that we've reduce it when we can call j-eval with our own next procedure as its next procedure (technically, these procedures are known as "continuations").
Now let's take a look at an evaluation procedure for a statement. Here's the one for block:
(define (j-eval-block exp env next)
;; (BLOCK <statement> ...)
(define (loop exprs)
(if (null? exprs)
(next "End of block")
(j-eval (car exprs) env
(lambda (val) (loop (cdr exprs))))))
(loop (cdr exp)))
Notice that when it calls j-eval it does not use its
own continuation (next); instead it uses the procedure
created by (lambda (val) (loop (cdr exprs))). This means
that evaluating each statement within the block is a subproblem
of the original problem. It isn't until after the last statement is
evaluated that we actually use our own continuation to return a value.
Let's take a look at a few more evaluation procedures and then we can start the actual problem set (you can't wait, right?). Let's take a look at how labels work in this interpreter. First, how do we create a label? That's what the Decaf Java-In-Scheme (LABEL <statement>) syntax is for. Here's how it is implemented:
(define (j-eval-label exp env next)
;; (LABEL <name> <statement>)
(let ((name (label.name exp))
(labels *labels*)) ; (1)
(define (return-here) ; (2)
(set! *labels* labels)
(next "End of labelled block"))
(add-label! name return-here) ; (3)
(j-eval (label.statement exp) env ; (4)
(lambda (val) (return-here)))))
(define *labels* '())
(define (add-label! name label)
(set! *labels*
`(,(make-binding name '*LABEL* label)
,@*labels*)))
It is a bit ugly, but all it really does is change the value of the
global variable *labels* by adding one new binding to the
front of the list -- a binding from the label name to the procedure we
specify (return-here in our case) with a type of
*LABEL*.
(define (j-eval-break exp env next) ;; (BREAK) or (BREAK) (exit exp '*BREAK-LABEL*)) (define (exit exp default-label) ;; Evaluate the procedure associated with a given label, supplying a ;; default label if none is specified to use. Used to implement ;; (BREAK) and (BREAK <label>), etc. (let ((name (if (null? (list-tail exp 1)) default-label (second exp)))) ((lookup-label name)))) ; (1) (define (lookup-label name) (lookup-global name *labels* "No such label")) (define (lookup-global name global-var error-msg) ;; Find the value of a name in a global name space (let ((entry (assq name global-var))) ; (2) (if entry (binding.value entry) ; (3) (error error-msg name))))
Pay particular attention to how things hook into the read-eval-print loop. You can start it by calling (driver-loop) or you can use go to specify some code to evaluate before the loop begins. A simple model you might have is that when you type into the loop, is that you are actually typing code that is part of a static method defined in an unnamed class that's global to the whole program. This idea lets you access other classes, and so forth.
To make it a bit more convenient, we've added a new statement to the Decaf Java-In-Scheme language:
(GLOBAL <type> <VariableDecl> ...)This allows you to declare new variables that are visible from the read-eval-print loop. With this in place, you can run a simple factorial example like the one in test2:
(define (test2 n)
(run-exprs
`((CLASS fact
(METHOD (public static) number fact-fn ((number n))
(VARIABLES number ((= result 1))
(BLOCK
(FOR (number (= i 2)) (<= i n) (+= i 1)
(*= result i))
(RETURN result)))))
(global fact (= x (new fact)))
(call (dot x fact-fn) ,n))))
That's roughly equivalent to the following Java program:
class fact
{ method (public static) int fact-fn (int n)
{ number result=1;
for (int i=2; i <= n; i += 1) result *= i;
return result;
}
}
class REPLoop
{ fact x = new fact();
x.fact-fn (n);
}
You can use it like:
(/ (test2 10) (test2 9)) ==> 10
(define (j-eval-while exp env next)
;; (WHILE <boolean expr> <statement>)
(let ((labels *labels*) ; (1)
(bool-expr (while.predicate exp))
(statement (while.statement exp)))
(define (done) ; (2)
(set! *labels* labels)
(next "End of WHILE"))
(define (again) ; (3)
(j-eval bool-expr env ; (4)
(lambda (bool) ; (5)
(cond ((eq? bool #T)
(j-eval statement env
(lambda (ignored-value) (again))))
((eq? bool #F (done)))
(else (error "WHILE: Not a boolean value"))))))
(add-label! '*BREAK-LABEL* done) ; (6)
(add-label! '*CONTINUE-LABEL* again)
(again))) ; (7)
But what happens in Java? If you read the reference manual carefully you will see that the rules for variables and the rules for methods are very different. Finding the value of a variable is very similar to what Scheme does: there is a chain of environment frames, and they are searched in order to find the value of the variable. In Decaf Java-In-Scheme there are four ways for variables to be created and given values:
Object frames are created by the new expression. The other three parts of the object frame are the object's class (an ev-class structure); the "superobject" (that is, an object frame that has the dynamic fields of this object's class's direct superclass); and the object itself (i.e. the value of this inside any of this object's own methods). We won't be concerned with these other parts yet. So how do we actually search for the value of a variable (a similar set of rules applies for handling dot references). We start at the current frame and search it (using assq) for a binding for the name. If it's there, the binding will supply both the type and the value of the variable. If it's not there, we proceed to the parent frame. This continues until we get to the "base frame". For each kind of base frame there is a special rule that determines how to continue the search.
(define (find-binding var env next)
;; This searches only the runtime bindings created by calling
;; methods and the fields (not methods) from classes and interfaces.
(let ((the-object #F))
(define (loop env)
(let ((frame (first-frame env)))
(let ((entry (assq var (frame.bindings frame))))
(or entry
(case (first frame)
((BINDINGS FOR VARIABLES)
;; Simplest case: try the next frame in the chain
(loop (rest-of-frames env)))
((GLOBAL)
;; Base case 1: Global environment, variable not found
#F)
((CLASS)
;; Base case 2: Class, first look at the interfaces,
;; then either the superclass or the superobject
(or
(any loop
(map basic-environment (ev-class.implements frame)))
(let ((super (if the-object
(object.superobject the-object)
(ev-class.superclass frame))))
(and super
(not (predefined-class? super))
(loop super)))))
((INTERFACE)
;; Base case 3: Interface, look at inherited interfaces
(any loop (map basic-environment
(interface.implements frame))))
((OBJECT)
;; Base case 4: Object, look at the class, but
;; remember that we've found an object so that parent
;; dynamic fields are considered as we go up the
;; chain of superclasses
(set! the-object frame)
(loop (basic-environment (object.class the-object)))))))))
(let ((result (loop env)))
(if result
(next result)
(error "Variable not found" var)))))
How is method lookup handled? Take a look at the following code, the
main part of method-lookup in java-support:
(define (method-lookup method-name arg-types env next)
...definitions...
(find-object-initial-class-and-name ; (1)
(lambda (object initial-class name) ; (2)
(find-all-matching-methods object initial-class name ; (3)
(lambda (methods signatures classes) ; (4)
(find-best-method object methods signatures classes)))))) ; (5)
(define (supertype-loop desired-type type count)
;; Returns 0 for same type, 1 for one step away, etc., or #F
;; if type-name isn't a supertype of type
(if (eq? type desired-type)
...
(let ((super (ev-class.superclass type)))
(if (ev-class? super)
...
...))))
(define (decide new best decision)
;; We start out TIEd, and only change that if we find
;; something definitely better or definitely worse. If we
;; have to change our mind from NO to YES (or vice versa)
;; then it's really a TIE.
...)
Complete the code and put it into java-testing. Hand in the
code and evidence that it works. You might want to try
(method-test) with the version of the code we've supplied and
then compare the results to what you get.
Hints:
(define (j-eval-variables exp env next)
;; Moved to JAVA-WORKING
;; (VARIABLES <type> (<VariableDecl> ...) <statement>)
(let ((vdecls (variables.var-decls exp))
(type-name (variables.type exp)))
(let ((names (map var-decl.name vdecls))
(type (lookup-class type-name))
(init-exprs
(map (get-initializer type-name) vdecls)))
;; INIT-EXPRS is a list of Decaf Java-In-Scheme expressions that must
;; be evaluated to initialize the new variables.
(j-eval-in-order init-exprs env
(lambda (init-vals)
;; Extend the environment by creating bindings for the new
;; variables (including "assignment conversion" of the
;; values), and then in this extended environment evaluate
;; the statement.
.......... YOUR CODE HERE ...........)))))
class SuperClass
{ int X;
}
class SubClass extends SuperClass
{ void Test2(SuperClass x)
{ System.out.println("Superclass arg.");
}
void Test2(SubClass x)
{ System.out.println("Subclass arg.");
}
}
class Lynn
{ public static void main( String[] args )
{ SubClass sub = new SubClass();
SuperClass sup = new SuperClass(),
subAsSup = sub;
System.out.print("Test2(sup)"); sub.Test2(sup);
System.out.print("Test2(sub)"); sub.Test2(sub);
System.out.print("Test2(subAsSup)");
sub.Test2(subAsSup);
}
}
This produces the following output:
Superclass arg. Subclass arg. Superclass arg.It is this final line that's odd. The variable subAsSup is declared to have type SuperClass, so the compiler decides that it must call a procedure whose signature is "method from an object of type SubClass with an argument of type SuperClass". But at runtime, the actual value of the argument is SubClass, not SuperClass. But the compiler decided to call the method designed for SuperClass objects, and that's what gets called.
If you try this in Java-In-Scheme, you'll get the same result. But we could make the argument (and many object-oriented languages have done so) that this is the wrong behavior. We really want to call the method that best matches the argument being passed: the one that would pring "Subclass arg." in this case.