You can extract languages in a step-by-step fashion starting from a naive implementation of the program. We motivated this in class with a naive slideshow program (similar to that in the notes):

(begin
  (print-string "------------------------------")
  (print-string "Hand Evals in DrScheme")
  (print-string "Hand evaluation helps you learn how Scheme reduces programs to values")
  (print-string "------------------------------")
  (await-click)
  (print-string "------------------------------")
  (print-string "Example 1")
  (print-string "(+ (* 2 3) 6)")
  (print-string "(+ 6 6)")
  (print-string "12")
  (print-string "------------------------------")
  (await-click)
  (print-string "------------------------------")
  (print-string "Summary: How to Hand Eval")
  (print-string "1. Find the innermost expression")
  (print-string "2. Evaluate one step")
  (print-string "3. Repeat until have a value")
  (print-string "------------------------------")
  (await-click)
  (print-string "end of show")
)

Languages provide a shorthand for common patterns that programmers want to write. To identify the pieces of the language, start by identifying the common patterns. Where are the common patterns in this sample?
There appears to be one for printing slide bars and contents followed by clicks:

(begin
  ============================================================
  | (print-string "------------------------------")
  | (print-string "Hand Evals in DrScheme")
  | (print-string "Hand evaluation helps you learn how Scheme reduces programs to values")
  | (print-string "------------------------------")
  | (await-click)
  ============================================================
  ============================================================
  | (print-string "------------------------------")
  | (print-string "Example 1")
  | (print-string "(+ (* 2 3) 6)")
  | (print-string "(+ 6 6)")
  | (print-string "12")
  | (print-string "------------------------------")
  | (await-click)
  ============================================================
  ============================================================
  | (print-string "------------------------------")
  | (print-string "Summary: How to Hand Eval")
  | (print-string "1. Find the innermost expression")
  | (print-string "2. Evaluate one step")
  | (print-string "3. Repeat until have a value")
  | (print-string "------------------------------")
  | (await-click)
  ============================================================
  (print-string "end of show")
)

Each common pattern becomes a command (aka operation) in the language. When you find a common pattern, choose an operation name that captures what the pattern is doing. This pattern appears to be printing or displaying slides, so we'll call this pattern "display".

;; A command is a
;;   - (make-display ...)

What is the argument to the display command? For this, look at the boxes and determine where their contents are different.
Create a struct with as many parts as there are differences -- this struct becomes part of the data of the language, and the argument to the operator.

;; A slide is a (make-slide string slide-body)
(define-struct slide (title body))

;; A slide-body is either
;;   - string
;;   - (make-pointlist list[string] boolean)
(define-struct pointlist (points numbered?))

;; A command is a
;;   - (make-display slide)
(define-struct display (slide))

You can now replace each box with the corresponding command and data to start creating the program in your language:

(begin
  (make-display
    (make-slide "Hand Evals in DrScheme"
                "Hand evaluation helps you learn how Scheme reduces programs to values"))
  (make-display
    (make-slide "Example 1"
                (make-pointlist (list "(+ (* 2 3) 6)"
                                      "(+ 6 6)"
                                      "12")
                                false)))
  (make-display
    (make-slide "Summary: How to Hand Eval"
                (make-pointlist (list "Find the innermost expression"
                                      "Evaluate one step"
                                      "Repeat until have a value")
                                true)))
  (print-string "end of show")
)

The code that formed the common pattern (the prints and await-clicks in this case) would all go into the interpreter for the command that corresponds to that pattern (the display command in this case). Note that the numbers in front of the summary points are gone from the data: the numbered? flag tells the interpreter to print them.

After you have replaced the boxes, repeat the process looking for other common patterns. When there are no common patterns left, create some data structures (such as lists) to hold the commands and move the remaining code from the naive version (the begin and end of show message) into the interpreter for the program.

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Here's another example of language extraction using a naive version of the tax-preparation program (assume that store-line-data, lookup-line-data and add-line-data are helper functions that store and retrieve the user's answers in some sort of table; compute-with-lines looks up the line numbers and calls an operation with them).
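The notes only assume these helpers exist. As a rough sketch of what they might look like, here is one version built on a single mutable hash table. The table name line-table, the error on a missing line, and the choice to store each computed result back under its own line (suggested by line 35 later reading line 22) are all assumptions for illustration, not definitions from the course.

```racket
#lang racket

;; Hypothetical storage: one mutable table mapping a line identifier
;; (a number such as 7 or a symbol such as '15a) to the stored answer.
(define line-table (make-hash))

;; store-line-data : line-id value -> void
;; records a single answer for a line, replacing any earlier answer
(define (store-line-data line-id value)
  (hash-set! line-table line-id value))

;; lookup-line-data : line-id -> value
;; retrieves the answer for a line; raises an error if none was stored
(define (lookup-line-data line-id)
  (hash-ref line-table line-id))

;; add-line-data : line-id value -> void
;; appends an answer to the list kept for a line that is asked repeatedly
(define (add-line-data line-id value)
  (hash-update! line-table line-id
                (lambda (so-far) (append so-far (list value)))
                '()))

;; compute-with-lines : line-id list[line-id] (num .. num -> num) -> num
;; applies op to the stored values of the input lines, stores the result
;; under result-id (an assumption), and returns it
(define (compute-with-lines result-id input-ids op)
  (let ([result (apply op (map lookup-line-data input-ids))])
    (store-line-data result-id result)
    result))

;; Usage: fill in three lines by hand, then total them into line 22.
(store-line-data 7 30000)
(store-line-data '15a 2000)
(store-line-data 21 500)
(compute-with-lines 22 (list 7 '15a 21) +) ; => 32500
```

Because the result is stored as well as returned, a later computation can name line 22 as one of its inputs.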
As before, start by boxing off a common pattern:

(begin
  (printf "Completing form 1040~n")
  ============================================================
  | (printf "Enter your name: ")
  | (store-line-data 'name (read))
  ============================================================
  (let loop ()
    (begin (printf "Enter dependent name: ")
           (let ([name (read)])
             (if (not (symbol=? name 'done))
                 (begin (add-line-data 'deps name)
                        (loop))))))
  ============================================================
  | (printf "Enter your wages: ")
  | (store-line-data '7 (read))
  ============================================================
  (printf "Enter taxable interest: ")
  (let ([ans (read)])
    (begin
      (store-line-data '8a ans)
      (if (> ans 400)
          (begin (printf "Completing schedule B ~n")
                 ============================================================
                 | (printf "Enter tax source ")
                 | (store-line-data 'B1 (read))))))
                 ============================================================
  ============================================================
  | (printf "Enter IRA distributions: ")
  | (store-line-data '15a (read))
  ============================================================
  ============================================================
  | (printf "Enter other income: ")
  | (store-line-data '21 (read))
  ============================================================
  (printf "Your total income is: ~a~n"
          (compute-with-lines 22 (list '7 '15a '21) +))
  ============================================================
  | (printf "Enter IRA deductions: ")
  | (store-line-data '26 (read))
  ============================================================
  (printf "Your adjusted gross income is: ~a~n"
          (compute-with-lines 35 (list '22 '26) -))
)

This pattern asks the user for data and stores the answer; what distinguishes the uses of this pattern is a line number and the text for that line. This yields the following data definitions:

;; A command is a
;;   - (make-ask line)
(define-struct ask (line))

;; A line is a (make-line string number)
(define-struct line (description number))

Let's replace the boxed pattern with the new command/data structs.
(begin
  (printf "Completing form 1040~n")
  (make-ask (make-line "Enter your name: " 'name))
  (let loop ()
    (begin (printf "Enter dependent name: ")
           (let ([name (read)])
             (if (not (symbol=? name 'done))
                 (begin (add-line-data 'deps name)
                        (loop))))))
  (make-ask (make-line "Enter your wages: " '7))
  (printf "Enter taxable interest: ")
  (let ([ans (read)])
    (begin
      (store-line-data '8a ans)
      (if (> ans 400)
          (begin (printf "Completing schedule B ~n")
                 (make-ask (make-line "Enter tax source " 'B1))))))
  (make-ask (make-line "Enter IRA distributions: " '15a))
  (make-ask (make-line "Enter other income: " '21))
  (printf "Your total income is: ~a~n"
          (compute-with-lines 22 (list '7 '15a '21) +))
  (make-ask (make-line "Enter IRA deductions: " '26))
  (printf "Your adjusted gross income is: ~a~n"
          (compute-with-lines 35 (list '22 '26) -))
)

Wait -- where did the old pattern code (the printf and the read) go? They still need to happen in order to prepare taxes, so we can't just get rid of them. Those lines become what the interpreter will do when it sees a make-ask command (taking the specific strings, etc. out of the data structures we created).

Let's build up the code for the command function in the interpreter as we go along, so you see how this would work. We'll build a run-command function that follows the templates for commands:

;; run-command : command -> void
;; executes a command in the tax language
(define (run-command acmd)
  (cond [(ask? acmd)
         (let ([line (ask-line acmd)])
           (begin
             (printf (line-description line))
             (store-line-data (line-number line) (read))))]
        ))

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Now, check for other common patterns. There seems to be one around computing line values, rather than asking the user for input:

(begin
  (printf "Completing form 1040~n")
  (make-ask (make-line "Enter your name: " 'name))
  (let loop ()
    (begin (printf "Enter dependent name: ")
           (let ([name (read)])
             (if (not (symbol=?
                       name 'done))
                 (begin (add-line-data 'deps name)
                        (loop))))))
  (make-ask (make-line "Enter your wages: " '7))
  (printf "Enter taxable interest: ")
  (let ([ans (read)])
    (begin
      (store-line-data '8a ans)
      (if (> ans 400)
          (begin (printf "Completing schedule B ~n")
                 (make-ask (make-line "Enter tax source " 'B1))))))
  (make-ask (make-line "Enter IRA distributions: " '15a))
  (make-ask (make-line "Enter other income: " '21))
  ============================================================
  | (printf "Your total income is: ~a~n"
  |         (compute-with-lines 22 (list '7 '15a '21) +))
  ============================================================
  (make-ask (make-line "Enter IRA deductions: " '26))
  ============================================================
  | (printf "Your adjusted gross income is: ~a~n"
  |         (compute-with-lines 35 (list '22 '26) -))
  ============================================================
)

Since this is a different pattern from the ask-for-info pattern, we should make a new command for it. The data that distinguish the uses of the pattern are a description of the computed value, the line number, and the operation and lines to use in computing the value. The line structure covers part of this, so we can reuse the line structure as one argument to the command:

;; A command is a
;;   - (make-ask line)
;;   - (make-compute line list[symbol] (num .. num -> number))
(define-struct ask (line))
(define-struct compute (line input-lines op))

;; A line is a (make-line string number)
(define-struct line (description number))

Once again, replace the pattern boxes with uses of the new command:

(begin
  (printf "Completing form 1040~n")
  (make-ask (make-line "Enter your name: " 'name))
  (let loop ()
    (begin (printf "Enter dependent name: ")
           (let ([name (read)])
             (if (not (symbol=?
                       name 'done))
                 (begin (add-line-data 'deps name)
                        (loop))))))
  (make-ask (make-line "Enter your wages: " '7))
  (printf "Enter taxable interest: ")
  (let ([ans (read)])
    (begin
      (store-line-data '8a ans)
      (if (> ans 400)
          (begin (printf "Completing schedule B ~n")
                 (make-ask (make-line "Enter tax source " 'B1))))))
  (make-ask (make-line "Enter IRA distributions: " '15a))
  (make-ask (make-line "Enter other income: " '21))
  (make-compute (make-line "Your total income is: " 22)
                (list '7 '15a '21) +)
  (make-ask (make-line "Enter IRA deductions: " '26))
  (make-compute (make-line "Your adjusted gross income is: " 35)
                (list '22 '26) -)
)

Move the pattern code that you replaced into the interpreter as the way to process the new command. The input lines and the operation are fields of the compute command itself, not of its line, so they are read with compute-input-lines and compute-op:

;; run-command : command -> void
;; executes a command in the tax language
(define (run-command acmd)
  (cond [(ask? acmd)
         (let ([line (ask-line acmd)])
           (begin
             (printf (line-description line))
             (store-line-data (line-number line) (read))))]
        [(compute? acmd)
         (let ([line (compute-line acmd)])
           (printf "~a~a~n"
                   (line-description line)
                   (compute-with-lines (line-number line)
                                       (compute-input-lines acmd)
                                       (compute-op acmd))))]
        ))

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

At this point, we don't seem to have common patterns, but we do seem to have some complicated ones that do work that a tax application should automate. Let's look first at the loop that gathers names of dependents:

(begin
  (printf "Completing form 1040~n")
  (make-ask (make-line "Enter your name: " 'name))
  ============================================================
  | (let loop ()
  |   (begin (printf "Enter dependent name: ")
  |          (let ([name (read)])
  |            (if (not (symbol=?
  |                      name 'done))
  |                (begin (add-line-data 'deps name)
  |                       (loop))))))
  ============================================================
  (make-ask (make-line "Enter your wages: " '7))
  (printf "Enter taxable interest: ")
  (let ([ans (read)])
    (begin
      (store-line-data '8a ans)
      (if (> ans 400)
          (begin (printf "Completing schedule B ~n")
                 (make-ask (make-line "Enter tax source " 'B1))))))
  (make-ask (make-line "Enter IRA distributions: " '15a))
  (make-ask (make-line "Enter other income: " '21))
  (make-compute (make-line "Your total income is: " 22)
                (list '7 '15a '21) +)
  (make-ask (make-line "Enter IRA deductions: " '26))
  (make-compute (make-line "Your adjusted gross income is: " 35)
                (list '22 '26) -)
)

Complicated patterns are also good sources of commands. This code seems to ask for information multiple times, so let's make an ask-multiple command to capture this pattern. What data might distinguish this code from another use of the same pattern? Looks like the info we're asking for and the line descriptor. That sounds exactly like the line data that we already have, so we can reuse that in the data definition:

;; A command is a
;;   - (make-ask line)
;;   - (make-compute line list[symbol] (num .. num -> number))
;;   - (make-ask-multiple line)
(define-struct ask (line))
(define-struct compute (line input-lines op))
(define-struct ask-multiple (line))

;; A line is a (make-line string number)
(define-struct line (description number))

Note that the differences between how we'd handle lines asked once and those asked multiple times would go into the interpreter for the make-ask and make-ask-multiple cases.
The program now looks as follows:

(begin
  (printf "Completing form 1040~n")
  (make-ask (make-line "Enter your name: " 'name))
  (make-ask-multiple (make-line "Enter dependent name: " 'deps))
  (make-ask (make-line "Enter your wages: " '7))
  (printf "Enter taxable interest: ")
  (let ([ans (read)])
    (begin
      (store-line-data '8a ans)
      (if (> ans 400)
          (begin (printf "Completing schedule B ~n")
                 (make-ask (make-line "Enter tax source " 'B1))))))
  (make-ask (make-line "Enter IRA distributions: " '15a))
  (make-ask (make-line "Enter other income: " '21))
  (make-compute (make-line "Your total income is: " 22)
                (list '7 '15a '21) +)
  (make-ask (make-line "Enter IRA deductions: " '26))
  (make-compute (make-line "Your adjusted gross income is: " 35)
                (list '22 '26) -)
)

The run-command function in the interpreter gets an additional clause for the new command:

;; run-command : command -> void
;; executes a command in the tax language
(define (run-command acmd)
  (cond [(ask? acmd)
         (let ([line (ask-line acmd)])
           (begin
             (printf (line-description line))
             (store-line-data (line-number line) (read))))]
        [(compute? acmd)
         (let ([line (compute-line acmd)])
           (printf "~a~a~n"
                   (line-description line)
                   (compute-with-lines (line-number line)
                                       (compute-input-lines acmd)
                                       (compute-op acmd))))]
        [(ask-multiple? acmd)
         (let ([line (ask-multiple-line acmd)])
           (let loop ()
             (begin (printf (line-description line))
                    (let ([name (read)])
                      (if (not (symbol=?
                                name 'done))
                          (begin (add-line-data (line-number line) name)
                                 (loop)))))))]
        ))

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

There seems to be one more complex pattern, the one that handles the schedules:

(begin
  (printf "Completing form 1040~n")
  (make-ask (make-line "Enter your name: " 'name))
  (make-ask-multiple (make-line "Enter dependent name: " 'deps))
  (make-ask (make-line "Enter your wages: " '7))
  ============================================================
  | (printf "Enter taxable interest: ")
  | (let ([ans (read)])
  |   (begin
  |     (store-line-data '8a ans)
  |     (if (> ans 400)
  |         (begin (printf "Completing schedule B ~n")
  |                (make-ask (make-line "Enter tax source " 'B1))))))
  ============================================================
  (make-ask (make-line "Enter IRA distributions: " '15a))
  (make-ask (make-line "Enter other income: " '21))
  (make-compute (make-line "Your total income is: " 22)
                (list '7 '15a '21) +)
  (make-ask (make-line "Enter IRA deductions: " '26))
  (make-compute (make-line "Your adjusted gross income is: " 35)
                (list '22 '26) -)
)

This is another command, one that performs a test then jumps to a schedule if necessary. The relevant data in the pattern seems to be the line number, the test, and the form to jump to. Note that the form to jump to has likely used the same patterns we've created up until now.

;; A command is a
;;   - (make-ask line)
;;   - (make-compute line list[symbol] (num ..
;;     num -> number))
;;   - (make-ask-multiple line)
;;   - (make-schedule-test line (number -> boolean) form)
(define-struct ask (line))
(define-struct compute (line input-lines op))
(define-struct ask-multiple (line))
(define-struct schedule-test (ques test then-form))

;; A line is a (make-line string number)
(define-struct line (description number))

(begin
  (printf "Completing form 1040~n")
  (make-ask (make-line "Enter your name: " 'name))
  (make-ask-multiple (make-line "Enter dependent name: " 'deps))
  (make-ask (make-line "Enter your wages: " '7))
  (make-schedule-test
    (make-line "Enter taxable interest: " '8a)
    (lambda (ans) (> ans 400))
    (begin (printf "Completing schedule B ~n")
           (make-ask (make-line "Enter tax source " 'B1))))
  (make-ask (make-line "Enter IRA distributions: " '15a))
  (make-ask (make-line "Enter other income: " '21))
  (make-compute (make-line "Your total income is: " 22)
                (list '7 '15a '21) +)
  (make-ask (make-line "Enter IRA deductions: " '26))
  (make-compute (make-line "Your adjusted gross income is: " 35)
                (list '22 '26) -)
)

Extend the interpreter, as usual:

;; run-command : command -> void
;; executes a command in the tax language
(define (run-command acmd)
  (cond [(ask? acmd)
         (let ([line (ask-line acmd)])
           (begin
             (printf (line-description line))
             (store-line-data (line-number line) (read))))]
        [(compute? acmd)
         (let ([line (compute-line acmd)])
           (printf "~a~a~n"
                   (line-description line)
                   (compute-with-lines (line-number line)
                                       (compute-input-lines acmd)
                                       (compute-op acmd))))]
        [(ask-multiple? acmd)
         (let ([line (ask-multiple-line acmd)])
           (let loop ()
             (begin (printf (line-description line))
                    (let ([name (read)])
                      (if (not (symbol=? name 'done))
                          (begin (add-line-data (line-number line) name)
                                 (loop)))))))]
        [(schedule-test?
          acmd)
         (let ([line (schedule-test-ques acmd)])
           (begin
             (printf (line-description line))
             (let ([ans (read)])
               (begin
                 (store-line-data (line-number line) ans)
                 (if ((schedule-test-test acmd) ans)
                     (schedule-test-then-form acmd))))))]
        ))

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

The last remaining common pattern appears to be the printing of the form title followed by the commands on that form (for both the overall form and the schedule). This is really the program, not a command, so we'll make a data definition for forms that holds a name and a list of commands:

;; a form is a (make-form string list[command])
(define-struct form (name cmdlist))

(make-form
  "form 1040"
  (list
    (make-ask (make-line "Enter your name: " 'name))
    (make-ask-multiple (make-line "Enter dependent name: " 'deps))
    (make-ask (make-line "Enter your wages: " '7))
    (make-schedule-test
      (make-line "Enter taxable interest: " '8a)
      (lambda (ans) (> ans 400))
      (make-form "schedule B "
                 (list (make-ask (make-line "Enter tax source " 'B1)))))
    (make-ask (make-line "Enter IRA distributions: " '15a))
    (make-ask (make-line "Enter other income: " '21))
    (make-compute (make-line "Your total income is: " 22)
                  (list '7 '15a '21) +)
    (make-ask (make-line "Enter IRA deductions: " '26))
    (make-compute (make-line "Your adjusted gross income is: " 35)
                  (list '22 '26) -)
  ))

Now, you have the final example of the tax form in your language and the definition of the language itself:

;; A command is a
;;   - (make-ask line)
;;   - (make-compute line list[symbol] (num ..
;;     num -> number))
;;   - (make-ask-multiple line)
;;   - (make-schedule-test line (number -> boolean) form)
(define-struct ask (line))
(define-struct compute (line input-lines op))
(define-struct ask-multiple (line))
(define-struct schedule-test (ques test then-form))

;; A line is a (make-line string number)
(define-struct line (description number))

;; a form is a (make-form string list[command])
(define-struct form (name cmdlist))

To finish the interpreter, we need to handle the new data definition. Since the edits didn't add a variant to the command data definition, we don't add a clause to run-command. Instead, we introduce a run-form function (to correspond to the new form data definition), and a run-cmdlist function (since form introduced the list-of-commands data type).

;; run-form : form -> void
;; prints the form title and runs the commands in a tax form
(define (run-form aform)
  (begin
    (printf "Completing ~a~n" (form-name aform))
    (run-cmdlist (form-cmdlist aform))))

;; run-cmdlist : list[command] -> void
;; runs the commands in a list of commands
(define (run-cmdlist clst)
  (cond [(empty? clst) (void)]
        [(cons? clst)
         (begin (run-command (first clst))
                (run-cmdlist (rest clst)))]))

We also need to call run-form from within the schedule test clause in run-command, since we edited the arguments to the schedule test to be a form structure.

;; run-command : command -> void
;; executes a command in the tax language
(define (run-command acmd)
  (cond [(ask? acmd)
         (let ([line (ask-line acmd)])
           (begin
             (printf (line-description line))
             (store-line-data (line-number line) (read))))]
        [(compute? acmd)
         (let ([line (compute-line acmd)])
           (printf "~a~a~n"
                   (line-description line)
                   (compute-with-lines (line-number line)
                                       (compute-input-lines acmd)
                                       (compute-op acmd))))]
        [(ask-multiple? acmd)
         (let ([line (ask-multiple-line acmd)])
           (let loop ()
             (begin (printf (line-description line))
                    (let ([name (read)])
                      (if (not (symbol=? name 'done))
                          (begin (add-line-data (line-number line) name)
                                 (loop)))))))]
        [(schedule-test?
          acmd)
         (let ([line (schedule-test-ques acmd)])
           (begin
             (printf (line-description line))
             (let ([ans (read)])
               (begin
                 (store-line-data (line-number line) ans)
                 (if ((schedule-test-test acmd) ans)
                     (run-form (schedule-test-then-form acmd)))))))]
        ))

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Recap: Summing up, what did we do?

1. Start with a naive example that executes one tax form (slide show, etc.)
2. Look for a pattern of code that gets used repeatedly (or that forms a complex yet cohesive operation)
3. Create a command for that pattern. Create new data for the pattern if you don't already have an appropriate kind of data.
4. Edit the naive program to use the new command in place of the pattern.
5. Move the replaced pattern code into the run-command function of the interpreter.
6. Repeat from step 2 until all the code has moved into a new language.

You can follow these steps to derive a language and its interpreter from concrete examples of programs.
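The same steps can be carried through for the slideshow example from the start of these notes, whose interpreter was described but not written out. The following is only a sketch under stated assumptions: print-string and await-click are given stand-in definitions here (the real ones came with the course software), the names bar, run-body and run-show are made up for illustration, and the program is held in a plain list of commands.

```racket
#lang racket

;; Stand-ins for the helpers the naive program assumed (hypothetical).
(define (print-string s) (displayln s))
(define (await-click) (void)) ; a real version would wait for a mouse click

;; Data definitions from the slideshow section of the notes.
(define-struct slide (title body))
(define-struct pointlist (points numbered?))
(define-struct display (slide))

;; The separator every slide prints above and below its contents.
(define bar "------------------------------")

;; run-body : slide-body -> void
;; prints a string body as is, or each point on its own line,
;; prefixed with "1. ", "2. ", ... when the pointlist is numbered
(define (run-body body)
  (cond [(string? body) (print-string body)]
        [(pointlist? body)
         (for ([pt (pointlist-points body)]
               [i (in-naturals 1)])
           (print-string (if (pointlist-numbered? body)
                             (format "~a. ~a" i pt)
                             pt)))]))

;; run-command : command -> void
;; the code from the boxed pattern, with the slide's data filled in
(define (run-command acmd)
  (cond [(display? acmd)
         (let ([s (display-slide acmd)])
           (print-string bar)
           (print-string (slide-title s))
           (run-body (slide-body s))
           (print-string bar)
           (await-click))]))

;; run-show : list[command] -> void
;; the leftover code from the naive version: run everything, then sign off
(define (run-show cmds)
  (for-each run-command cmds)
  (print-string "end of show"))

;; The talk, written in the extracted language.
(define talk
  (list
   (make-display
    (make-slide "Hand Evals in DrScheme"
                "Hand evaluation helps you learn how Scheme reduces programs to values"))
   (make-display
    (make-slide "Example 1"
                (make-pointlist (list "(+ (* 2 3) 6)" "(+ 6 6)" "12") false)))
   (make-display
    (make-slide "Summary: How to Hand Eval"
                (make-pointlist (list "Find the innermost expression"
                                      "Evaluate one step"
                                      "Repeat until have a value")
                                true)))))

(run-show talk)
```

With these stand-ins, running the talk should print the same lines as the naive program, with the numbering of the summary points now produced by the interpreter instead of being typed into each string.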