This page tries to answer questions that have come up during the interpreter segment of the course.
What is the point of this portion of the course?
We are moving into studying programming languages (as opposed to just functional programming). Implementing programming languages is a great way to study them. Therefore, this portion of the course is about how to implement programming languages.
What do we need to do in order to implement a programming language? We need to write a program that runs programs in the language we want to implement. The first step is to come up with a data model for programs. Why? Consider the following analogy:
When we wanted to write programs to manage a zoo of animals, we needed data structures to capture animals and zoos.
When we wanted to write programs to process family trees, we needed data structures to capture family trees.
If we want to write programs to process programs, we need data structures to capture programs.
How do we come up with a data model for programs?
There are several possible data models for programs. One that comes to mind quickly is a string (or a list of characters). This is a poor choice if we want to implement the language though, because to run the program, we need to know what programming constructs (loops, if-statements, arithmetic, etc) the program used. Tossing an entire program into one string obscures this structure.
As an analogy, do you think of a textbook as a string? We generally think of texts as structured into chapters and sections (imagine giving assignments as "do the exercise starting at the 15,467th character in the book" as opposed to "section 1.3, problem 2"). The string obscures the structure of the text, just as it would obscure the structure of the program.
To come up with a data model for a program, we need to look at the constructs in the program. We then need a way to capture each construct (and its pieces) as data. So, to capture binary addition (such as 4 + 5), we need a struct like (make-plus 4 5). The struct yields the data that captures the original program.
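As a small sketch of what "capturing a construct as data" means, the snippet below defines only the plus struct and builds the data for 4 + 5. The name four-plus-five is made up for this illustration; it assumes a DrScheme language level that provides define-struct.

```scheme
;; A plus struct captures one binary addition and its two pieces.
(define-struct plus (left right))

;; The program 4 + 5, captured as data.
(define four-plus-five (make-plus 4 5))

;; The selectors created by define-struct recover the pieces.
(plus-left four-plus-five)    ; 4
(plus-right four-plus-five)   ; 5

;; The predicate tells us which construct we are looking at.
(plus? four-plus-five)        ; true
```

Unlike the string "4 + 5", the struct keeps the operator and the two operands apart, so a program that processes it never has to search through characters.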
To answer this in more detail, we need a specific language to implement. Enter Curly. See the next section on the Curly data definition for more details.
What's the deal with this Curly, when is it going to fade out, and can you give us links to info on it?
Curly is the simple language we are trying to implement. Curly is derived from Curl, an internet programming language (see www.curl.com for information on Curl; that site won't help you with Curly, though). Curl itself has been heavily influenced by Scheme, so the programs and style of evaluation in Curl strongly resemble those of Scheme. Curly is a slightly modified version of a small subset of Curl.
Traditionally, 2135 has asked students to implement a portion of Scheme. However, many students get confused with the idea of implementing Scheme in Scheme. To avoid this confusion, we are implementing Curly instead. However, the only real difference between Scheme and Curly lies in the syntax.
We will stick with Curly through the rest of this section of the course. If it helps you to think in terms of the corresponding Scheme programs, go ahead and do so.
What specifically are we trying to do/accomplish with this program (eval)?
Eval is an interpreter. Eval takes an expression (which captures a Curly program), runs the program, and returns the answer that the program produces.
So, let's say we wanted to run the Curly program
{{proc {x} {return x + 5}} 4}
[Note: this is like the Scheme program ((lambda (x) (+ x 5)) 4).] We expect to get the answer 9. Thus, if we run this program through eval, we should get 9, i.e.:
(eval (make-apply (make-proc 'x (make-plus (make-var 'x) 5)) 4))
Are we writing a compiler?
No. We are writing an interpreter (aka evaluator). An interpreter is a program that takes a program and produces a value (the "answer" from running the program). A compiler is a program that takes a program and produces another program (often in a different language). Compilers don't evaluate/run their input program; interpreters do.
Let's look at how to derive a data definition for Curly. We'll start with a simple Curly program:
{define f {proc {y} {return y + 7}}}
{f {7 * 2}}
Our data definition needs a structure to model each construct in this program. What constructs are there?
Keeping definitions separate from other expressions (programs we can run), this gives rise to the following two data definitions (with their define structs):
An expr is one of
  - number
  - (make-var symbol)
  - (make-proc symbol expr)
  - (make-plus expr expr)
  - (make-mult expr expr)
  - (make-apply expr expr)

A definition is a (make-def symbol expr)

(define-struct var (name))
(define-struct proc (param body))
(define-struct plus (left right))
(define-struct mult (left right))
(define-struct apply (func arg))
(define-struct def (name exp))
We now have a way to turn Curly programs into data so that we can manipulate the programs. Here are two examples:
{define f {proc {y} {return y + 7}}}
{f {7 * 2}}

(make-def 'f (make-proc 'y (make-plus (make-var 'y) 7)))
(make-apply (make-var 'f) (make-mult 7 2))
{f (3 + {g {(2 + 8) * 5}})}

(make-apply (make-var 'f)
            (make-plus 3
                       (make-apply (make-var 'g)
                                   (make-mult (make-plus 2 8) 5))))
Now, we're in a position to write eval, which takes an expr (representing a Curly program) and returns the result we would get from running that program. See the code for eval as of class on Friday, Sept 21.
These answers refer to the Curly data definition and the code for eval.
What is the apply struct used for (and how is it used)?
The apply struct captures calls to functions (as opposed to definitions of functions, which the proc struct captures). See the examples above for how it is used.
What is the proc struct used for?
The proc struct captures definitions of functions (as opposed to calls to functions, which the apply struct captures). See the examples above for how it is used.
What are the var? etc in the eval/subst functions?
var?, plus?, etc are created by the define-structs that we created to capture Curly programs. We use them to determine which language construct an expression corresponds to.
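To see the predicates at work outside of eval, here is a minimal sketch. The function construct-name is a hypothetical helper invented for this illustration (it is not part of the class code); it uses the same cond-on-predicates shape that eval and subst use.

```scheme
;; Structs from the Curly data definition.
(define-struct var (name))
(define-struct proc (param body))
(define-struct plus (left right))
(define-struct mult (left right))
(define-struct apply (func arg))

;; construct-name : expr -> symbol
;; Hypothetical helper: names the construct at the top of an expr.
;; Each define-struct above created the predicate used in one clause.
(define (construct-name exp)
  (cond [(number? exp) 'number]
        [(var? exp) 'var]
        [(proc? exp) 'proc]
        [(plus? exp) 'plus]
        [(mult? exp) 'mult]
        [(apply? exp) 'apply]))

(construct-name (make-plus (make-var 'x) 5))                  ; 'plus
(construct-name (make-apply (make-var 'f) (make-mult 7 2)))   ; 'apply
```

There is one cond clause per line of the expr data definition, which is why every function on expr ends up with the same skeleton.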
Why do we have to replace the parameter with the argument in the body of the function?/Why do we need subst?
Curly programs evaluate in a similar fashion to Scheme programs. Consider the program:
{define f {proc {y} {return y + 7}}}
{f {7 * 2}}
We evaluate this by computing the value of the argument to f, substituting that value for y in the body of f, then evaluating the resulting (substituted) body expression. As a sequence of Curly programs, this yields:
{f {7 * 2}}
{f 14}
{return 14 + 7}
21
This is the same sequence of steps we would have used in Scheme. We have to replace the parameter with the argument because that is what happens when we call a function (i.e., that is the rule by which a call to a function yields an answer). The subst function handles replacing the parameter with the argument, just as we have been doing all along by hand.
As for why we need a separate subst function, we need to process the entire expression (to substitute everywhere), which means we need to explore the expression recursively. A new recursive operation requires a new function. Put another way, use a separate function for each computation on the same datatype (eval is one computation on expr, subst is another).
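The two computations described above can be sketched as follows. This is one possible version written to match the answers on this page, not necessarily the code from class; the shadowing check in the proc case of subst is an assumption, and f is written out as a proc in the example because definitions are not handled here.

```scheme
;; Structs from the Curly data definition.
(define-struct var (name))
(define-struct proc (param body))
(define-struct plus (left right))
(define-struct mult (left right))
(define-struct apply (func arg))

;; subst : expr symbol expr -> expr
;; Replaces each use of the variable called name in exp with val.
(define (subst exp name val)
  (cond [(number? exp) exp]
        [(var? exp) (if (eq? (var-name exp) name) val exp)]
        [(proc? exp)
         ;; Assumption: an inner proc that reuses the same parameter
         ;; name shadows the outer one, so its body is left alone.
         (if (eq? (proc-param exp) name)
             exp
             (make-proc (proc-param exp)
                        (subst (proc-body exp) name val)))]
        [(plus? exp) (make-plus (subst (plus-left exp) name val)
                                (subst (plus-right exp) name val))]
        [(mult? exp) (make-mult (subst (mult-left exp) name val)
                                (subst (mult-right exp) name val))]
        [(apply? exp) (make-apply (subst (apply-func exp) name val)
                                  (subst (apply-arg exp) name val))]))

;; eval : expr -> number or proc
;; Runs the Curly program that exp represents.
(define (eval exp)
  (cond [(number? exp) exp]
        [(var? exp) (error 'eval "unbound variable")]
        [(proc? exp) exp]
        [(plus? exp) (+ (eval (plus-left exp)) (eval (plus-right exp)))]
        [(mult? exp) (* (eval (mult-left exp)) (eval (mult-right exp)))]
        [(apply? exp)
         ;; Evaluate the function and the argument, substitute the
         ;; argument for the parameter, then evaluate the new body.
         (let ([f (eval (apply-func exp))]
               [arg (eval (apply-arg exp))])
           (eval (subst (proc-body f) (proc-param f) arg)))]))

;; {{proc {y} {return y + 7}} {7 * 2}}
(eval (make-apply (make-proc 'y (make-plus (make-var 'y) 7))
                  (make-mult 7 2)))   ; 21
```

Both functions follow the expr data definition clause by clause. They differ only in what each clause computes: subst rebuilds an expr, while eval produces a value.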
Why do we return an error in the var case?
Variables get their values when arguments are passed to functions (the apply case). In the apply case, we replace all uses of a parameter with the argument (through subst). Therefore, if we ever encounter a var, it wasn't a parameter to a function (i.e., it wasn't in the scope of any procedure). This causes an unbound variable error, because the variable has no value.
Why does (eval (apply-func exp)) in the (apply? exp) case return a proc and not a number?
We agreed that we will run eval only on programs that are well-typed. This means that we won't try to add/multiply procs, and we won't try to call numbers as functions. Nothing in the current code enforces this -- it falls under the "assume no need for error checking" clause that we've used all term.
So why the need for the error case for var? in eval? Eval is a function on expr, so it must follow the expr data definition. This requires it to have a cond clause for var, and the error is the only reasonable answer to put in that clause.
Note: You could certainly add type checking to your evaluator, but that's another topic for another course (though we've covered enough at this point for you to add simple type checking to your evaluator if you wanted to).
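For the curious, here is a rough sketch of one very simple form such checking could take: testing at run time that the operands of an addition are numbers. It is not part of the class code. The names check-number and eval-checked are hypothetical, and the sketch covers only numbers, procs and plus to stay short.

```scheme
(define-struct var (name))
(define-struct proc (param body))
(define-struct plus (left right))

;; check-number : value -> number
;; Hypothetical helper: returns v if it is a number, else signals an error.
(define (check-number v)
  (if (number? v)
      v
      (error 'eval "expected a number")))

;; eval-checked : expr -> number or proc
;; Like eval on numbers, procs and plus, but checks operands before adding.
(define (eval-checked exp)
  (cond [(number? exp) exp]
        [(proc? exp) exp]
        [(plus? exp) (+ (check-number (eval-checked (plus-left exp)))
                        (check-number (eval-checked (plus-right exp))))]
        [else (error 'eval "construct not covered in this sketch")]))

(eval-checked (make-plus 3 4))   ; 7

;; Adding a number and a proc now signals "expected a number"
;; rather than failing somewhere inside +:
;; (eval-checked (make-plus 3 (make-proc 'x (make-var 'x))))
```

The same idea would apply to the apply case: check that the evaluated function position is a proc before taking it apart.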
The language expansion seems to go on and on -- when does it finish?
Arithmetic was the first step. Adding functions was the second. Functions required adding vars, proc and apply, and we included define since you are used to giving functions names. It's not a never-ending process; we just added something that required multiple data definitions to implement. On homework, you'll add two more pieces, enough to allow you to capture recursive functions. That's all for new constructs, though we haven't finished handling definitions just yet ...
This page maintained by Kathi Fisler, Department of Computer Science, Worcester Polytechnic Institute