0. Announcements

Slide 1: Final Exam
Johnson Athletic Center (not what the Registrar said!), Wed. May 21, 1:30pm to 4:30pm.

1. Introduction, 5 mins., 10:05 - 10:10am

"Implementing Scheme on real computers" is the current theme. So far, we have computers with 7 registers, a stack, a smattering of built-in operations, and a controller.

Slide 2: 7 registers, stack, operators, bus, controller
Board: Draw top board, with 7 registers

Real computers have some number of registers, a smattering of built-in operations, and a bunch of "addressable memory". Today we want to see how to implement a stack and Scheme's CONS/CAR/CDR using traditional computer memory. We'll also see how and why Scheme and Java need to have a "garbage collector" and how it works.

2. Computer memory and the stack, 5 mins., 10:10 - 10:15am

Slide 3: Memory

We buy computer memory in large amounts. It is divided into "addressable chunks" which can be individually read and written. For our purposes, we will assume that each chunk is wide enough to hold a Scheme number or the address of another chunk of memory. This isn't the way it really works, but it's a good approximation.

We build a stack by using a new register, called STACK, that holds the address of an area of memory that we plan to use for this purpose. When we SAVE an item on the stack, we subtract from (or add to) this address and use the result as the new address for the stack.

Slide 4: Computer with Stack in Memory
Board: Draw middle board, with memory but not heap/stack division

Transition: now let's take a look at how to implement CONS, CAR, and CDR.

3. Memory Layout, 3 mins., 10:15 - 10:18am

Let's do an example on the board (bottom board):

    (set! z (cons 1 (cons 2 3)))

What does CONS do? It has to find two new spots for the CAR and the CDR. We could use the stack, but that would interfere with our using it to save and restore registers, since the usage pattern isn't the same.
Instead, we add yet one more register, FREE, and we split up the memory of the computer into three areas.

Board: draw the division.

Now, all CONS has to do is to put two things on the heap. It uses the FREE register to find out what's free, and then increments the FREE register by two.

Add lines and discuss stack growing down, heap growing up, evaluator fixed in memory.

Notice that we can assume that the environment frames are made exactly as we've done in the past: they are made using CONS, CAR, CDR, etc., so they are stored in the heap just like any other data structure.

4. Typed values and example, 5 mins., 10:18 - 10:23am

We'll need to tell, somehow, whether the contents of a memory location are a number or a CONS cell or something else. So we have the notion of typed data.

Slide 5: Typed Data

(CONS 2 3): puts 2 and 3 on the heap at 300 and 301, increments the free pointer from 300 to 301 and then 302, and puts CONS 300 into VAL.
(CONS 1 ...): puts 1 on the heap at 302 and CONS 300 on the heap at 303, increments the free pointer to 304, and puts CONS 302 into VAL.

This value goes into the environment, somewhere around 100, as the value of z.

5. Garbage Collection: the problem, 3 mins., 10:23 - 10:26am

Board: (set-cdr! z z)

This changes the contents of 303 to be CONS 302. Now, we can't just RESTORE to remove the now-unused stuff at 300 and 301, because we've got something at 302 and 303 still in use. We've created garbage.

Notice that it's not only user programs that create garbage. Take a careful look at how both the explicit control evaluator and the compiler's output code work: they create procedure objects on the heap, as well as environment frames. Java creates objects on the heap. In all of these cases, we need a consistent way to get rid of the things that were created but aren't needed any more.

6. What's required for GC?, 10 mins, 10:26 - 10:36am

We have to have a good definition of "garbage".
We can use one based on our observation of the register machine itself: the state of the register machine, we've said, is the contents of the registers plus the stack. We've built into our system the fact that saving these is enough to save the entire state of the system. So that means we can start with these registers, copy everything they point to, recursively, and that's everything the computer could get to.

Let's assume that we somehow discover we're running low on space in the heap. We grab the registers and make a LIST out of their contents. How would we go about copying this?

Slide 6:

    (define (copy-list-1 l)
      (cond ((null? l) '())
            ((pair? l)
             ;; every visit to a pair makes a fresh copy
             (cons (copy-list-1 (car l))
                   (copy-list-1 (cdr l))))
            (else l)))

Doesn't work because it changes the sharing relationships. Also doesn't work on circular data structures. Also it's recursive so it uses up stack space which may be limited.

Slide 7:

    (define (copy-list-2 l)
      (let ((tag (list 'BEEN-HERE-DONE-THAT)))
        (define (copy l)
          (cond ((null? l) '())
                ;; a pair we have not visited yet
                ((and (pair? l) (not (eq? (car l) tag)))
                 (let ((the-car (car l))
                       (the-cdr (cdr l))
                       (result (cons 0 0)))
                   ;; overwrite the old pair: mark it as visited and
                   ;; leave the address of its copy in the CDR
                   (set-car! l tag)
                   (set-cdr! l result)
                   (set-car! result (copy the-car))
                   (set-cdr! result (copy the-cdr))
                   result))
                ;; an already-visited pair: return its existing copy
                ((pair? l) (cdr l))
                (else l)))
        (copy l)))

This version works. Unfortunately, it's recursive.

7. Stop-and-Copy GC, 15 mins., 10:36 - 10:51am

Draw the example on the board.

Idea: we need to copy all of the "good stuff" into a new area. So let's divide the heap into two parts: one is in use until it fills up, then we copy the good stuff over into the other, and keep on going.

How do we do the copy? We'll use two new registers, FREE and SCAN. We'll have three invariants:
1) everything in the new part of memory with an address below SCAN is needed and is perfectly ready for use.
2) everything with an address at or above FREE is not in use.
3) everything between (and including) SCAN and (but not including) FREE is needed, and is a perfect copy of what it was before it was moved.

When SCAN==FREE, we are done, because everything at and above SCAN is free, and everything below SCAN has been fixed.

We start by copying all of the regular registers into the bottom of the free area of memory, pointing SCAN at the first location in the new area, and putting FREE just above the stuff just copied. Note that the invariants are true now (nothing below SCAN).

Each step of the algorithm moves SCAN forward by one address. In order to do this, we have to take the current thing that SCAN points to and make it ready for use in the new area. There are three cases:
a) We are scanning a number or boolean (i.e. not a pointer to anything else), so just move the SCAN pointer forward.
b) We are scanning a pointer, and it must point to the old space. We look there and see if it has a forwarding pointer.
   i) if so, then it's already been moved so we just need to change the scanned item to point to where it now lives.
   ii) if not, then we have to move it. Copy the thing it points to into wherever FREE points (and move FREE forward). Replace the original in the old space with a forwarding pointer to the copy, and change the scanned item to point to the copy.

8. Conclusion, 4 mins, 10:51 - 10:55am

You now have everything you need to make Scheme, or Java, run on a simple computer. All we are assuming is that there are about 7 registers, some simple functions, and a large memory. We put the stack into the memory, as well as the heap. We use the stack to store registers temporarily, and we use the heap for everything else: CONS cells (including the environment), PROCEDURES, strings, very large numbers (bignums), arrays in Java. The stack grows and shrinks as we save and restore registers. The heap grows as we create new objects and shrinks when we garbage collect because it ran out of space to grow.

The garbage collector we've seen is a "stop and copy" collector.
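
The SCAN/FREE loop from section 7 might be sketched as follows. This is only an illustration under stated assumptions, not the code used in the course: memory is modeled as a Scheme vector, each word is a (type . payload) pair, and the names make-word, pointer?, forwarded?, and stop-and-copy! are all made up for this sketch. Old and new space are simply two address ranges in the same vector, and the registers are passed in as a list of words.

```scheme
;; Assumed word format (hypothetical, for illustration only):
;;   (num . n)    an immediate number
;;   (cons . a)   pointer to a pair: CAR at address a, CDR at address a+1
;;   (fwd . a)    forwarding pointer left in old space: "moved to a"
(define (make-word type payload) (cons type payload))
(define (word-type w) (car w))
(define (word-payload w) (cdr w))
(define (pointer? w) (eq? (word-type w) 'cons))
(define (forwarded? w) (eq? (word-type w) 'fwd))

;; Copy everything reachable from ROOTS (a list of words standing in for
;; the registers) into the area of MEM that starts at NEW-BASE.
;; Returns the final value of FREE.
(define (stop-and-copy! mem roots new-base)
  (let ((free new-base))
    ;; Start: copy the registers to the bottom of the new area.
    (for-each (lambda (w)
                (vector-set! mem free w)
                (set! free (+ free 1)))
              roots)
    ;; Move SCAN forward one word at a time until it catches up with FREE.
    (let loop ((scan new-base))
      (if (< scan free)
          (let ((w (vector-ref mem scan)))
            ;; Case a (not a pointer) needs no work at all.
            (if (pointer? w)
                (let* ((old (word-payload w))
                       (head (vector-ref mem old)))
                  (if (forwarded? head)
                      ;; Case b-i: already moved, so just redirect.
                      (vector-set! mem scan (make-word 'cons (word-payload head)))
                      ;; Case b-ii: copy both halves to FREE, leave a
                      ;; forwarding pointer behind, then redirect.
                      (begin
                        (vector-set! mem free head)
                        (vector-set! mem (+ free 1) (vector-ref mem (+ old 1)))
                        (vector-set! mem old (make-word 'fwd free))
                        (vector-set! mem scan (make-word 'cons free))
                        (set! free (+ free 2))))))
            (loop (+ scan 1)))
          free))))

;; The heap from the board after (set-cdr! z z): 300 and 301 are garbage.
(define mem (make-vector 320 (make-word 'num 0)))
(vector-set! mem 300 (make-word 'num 2))
(vector-set! mem 301 (make-word 'num 3))
(vector-set! mem 302 (make-word 'num 1))
(vector-set! mem 303 (make-word 'cons 302))

;; One root, z = CONS 302; the new area is assumed to start at 310.
(define new-free (stop-and-copy! mem (list (make-word 'cons 302)) 310))
```

With these assumptions the collector ends with FREE at 313: address 310 holds the relocated z (CONS 311), 311 holds the number 1, and 312 holds CONS 311, so the circular structure and its sharing survive while the 2 and 3 at 300 and 301 are never copied. The loop is iterative, so unlike copy-list-2 it needs no stack space of its own.
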
There are variants that do a bit of copying interspersed with running, to get closer to real time. There are variants that garbage collect some parts of memory more often than others. There are variants that collect faster but slow down other operations. Lots of ideas to explore.
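
As a recap of sections 3 through 5, the board example can be replayed in a small simulation. This is a minimal sketch under assumptions: the vector memory, the register variable free-ptr, the (type . payload) word format, and the heap-cons, heap-car, heap-cdr, and heap-set-cdr! procedures are hypothetical stand-ins for the machine's memory and its built-in operations, and the heap is assumed to start at address 300 as on the slide.

```scheme
;; Hypothetical model: a vector of typed words plus a FREE register.
(define memory (make-vector 400 '(unused . 0)))
(define free-ptr 300)                  ; first free heap address

(define (make-word type payload) (cons type payload))

;; CONS: put the CAR at FREE and the CDR at FREE+1, increment FREE by
;; two, and return a typed pointer to the new cell.
(define (heap-cons car-word cdr-word)
  (let ((addr free-ptr))
    (vector-set! memory addr car-word)
    (vector-set! memory (+ addr 1) cdr-word)
    (set! free-ptr (+ free-ptr 2))
    (make-word 'cons addr)))

;; CAR and CDR just read the two words the pointer refers to.
(define (heap-car p) (vector-ref memory (cdr p)))
(define (heap-cdr p) (vector-ref memory (+ (cdr p) 1)))
(define (heap-set-cdr! p w) (vector-set! memory (+ (cdr p) 1) w))

;; (set! z (cons 1 (cons 2 3)))
;; The inner cons must finish before the outer one can run,
;; so it is the one that lands at 300.
(define z (heap-cons (make-word 'num 1)
                     (heap-cons (make-word 'num 2) (make-word 'num 3))))

;; (set-cdr! z z): address 303 now holds CONS 302.
(heap-set-cdr! z z)
```

After this runs, z is CONS 302, free-ptr is 304, and address 303 holds CONS 302. Nothing reachable from z refers to 300 or 301 any more, yet FREE cannot simply be moved back past them because 302 and 303 are still in use.
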