8 minutes
I Made a Lisp!
Note: this was copied from my GitHub. The original is here.
I Made a Lisp!
And you can too!
I caught a cold one weekend so I couldn’t leave the house. I used that time to
build a Lisp interpreter! This interpreter implements lambdas, mutable
variables, mutable pairs, closures, macros, and the quote and quasiquote reader
macros. Please see the examples folder or
prelude.lisp
for demonstrations of this Lisp’s features.
Below, I detail the pieces that go into creating a Lisp, with simplified code samples for various parts of the interpreter. I also detail which resources I used while making this Lisp.
The Pieces
Writing an interpreter consists of three parts:
- A reader that takes source code as input and produces an AST (abstract syntax tree) as output.
- An evaluator that takes an AST as input and produces an expression as output, computing side-effects as well.
- A printer that takes an expression as input and prints a human-readable representation.
These three parts are present in every interpreter that I’m aware of. Composing these three functions in a loop is known as a REPL, or Read-Eval-Print Loop. The Go code might look something like the following:
for {
Print(Eval(Read()))
}
While the printing step is pretty trivial, I’ll cover every other step in this section.
Parser
Here’s an example of a Lisp expression. This expression evaluates to the number 30.
(multiply 10 (add 1 2))
In this expression, multiply
and add
are symbols, 10
, 1
, and 2
are
numbers, and everything between a pair of parentheses is a list.
Lisp’s syntax makes it trivially easy to parse. The name Lisp (sometimes stylized LISP) is an abbreviation for “LISt Processor”, so every expression is written as a list. As such, the parsing strategy is trivial.
Our Read
function is very straightforward. It looks something like this:
func Read() Expr {
ConsumeWhitespace()
if e := ReadList(); e != nil {
return e
}
if e := ReadNumber(); e != nil {
return e
}
if e := ReadSymbol(); e != nil {
return e
}
panic("failed to parse")
}
Reading a number or symbol is trivial, and the list parsing function looks something like this:
// peekRune gets the next character without consuming it
// readRune gets the next character and consumes it
func ReadList() Expr {
if peekRune() != '(' {
return nil
}
readRune()
result := makeList()
for {
if readRune() == ')' {
return result
}
result.Append(Read())
}
}
This is basically all that you need to parse Lisp. The Read
function reads any
expression, calling functions like ReadList
, ReadSymbol
, and ReadNumber
,
and the ReadList
expression calls Read
for each element of the list.
This parsing strategy is called a “recursive descent” parser, and it works
because the parser doesn’t need to backtrack. Functions like peekRune
and
readRune
can operate on a stream with a fixed input buffer, since you don’t
need to peek an arbitrary number of characters. This parsing strategy allows for
a very simple parser to be constructed without using any sort of library.
Evaluator
You’ll notice that the return type of Read
was an Expr
. This is because, in
Lisp, there is no distinction between an AST and data. Since the AST is written
as a list, we can internally treat it just like Lisp’s list data structure. This
will allow us to do some cool things like macros (explained below).
Our evaluator function take two parameters, and expression and an environment, and will return an expression.
An environment is the “scope” of our evaluation. The environment is simply a mapping between variable names and values, and will be used to look up variables.
Our Eval
function looks something like this:
func Eval(expr Expr, env *Env) Expr {
switch expr := expr.(type) {
case *Number:
// return primitive data as itself
return expr
case *Symbol:
// look up variable and return its value
return env.Resolve(expr)
case *List:
// apply the function to its arguments
return Apply(Eval(expr.First(), env), expr.Rest(), env)
default:
panic("unknown expression type evaluated")
}
}
So we have three cases:
- If it’s primitive data, it evaluates to itself.
- If it’s a variable, look up the corresponding value. In our language, we
represent variables with a
Symbol
type that stores names (like the functionadd
or the numbern
). - If it’s a list, apply the function to the arguments. The function is either
an anonymous function literal or a variable, so just call
Eval
on it to get the resulting value.
Apply
But what does Apply
do? I lied when I said it only takes functions (which I’ll
call prodecures from here on out). It actually takes one of three things:
procedures, macros, and special forms.
Our Apply
function looks something like this:
func Apply(f Expr, args *List, env *Env) Expr {
switch f := f.(type) {
case *Procedure:
ApplyProcedure(f, args, env)
case *Macro:
ApplyMacro(f, args, env)
case SpecialForm:
f(args, env)
}
}
I’ll cover each of the three cases below.
Procedures
If you’ve programmed before, procedures (functions) should be familiar to you. A procedure is made up of a few things:
- The environment (scope) that it was defined in (i.e. defined at the top level or defined within some function as a closure)
- Its arguments (as a list)
- Its body (as an expression with the code as an AST)
Our ApplyProcedure
function looks something like this:
func ApplyProcedure(proc *Procedure, args *List, env *Env) {
// evaluate the arguments
evaluatedArgs := args.Map(func (expr Expr) Expr {
return Eval(expr, env)
})
// create a new environment and bind the arguments to their values
newEnv := makeEnv()
newEnv.SetParent(proc.ParentEnv)
for i, argSymbol := range proc.Args {
newEnv.Bind(argSymbol, evaluatedArgs.Index(i))
}
// evaluate the body with the new environment
return Eval(proc.Body, newEnv)
}
We use each of the three parts of a procedure:
- The environment was used as the parent of our new environment. The function body will have access to the same variables as where it was defined. The new environment allows the body to define new variables without polluting the outer environment (we don’t want all variables to be global).
- The arguments were used to bind the passed in argument values with the corresponding variable names.
- The body was evaluated in this new environment.
Macros
Macro are very similar to functions but their purpose is different. While functions take expressions as arguments and return an expression, macros take ASTs as arguments and return a new AST. This allows you to extend the language with new syntax.
Here’s an example of what a macro might look like:
;; Example:
;; The expression (subtract 10 3)
;; Evaluates to: 7
;; The expression (reverse-args (subtract 10 3))
;; Expands to: (subtract 3 10)
;; Evaluates to: -7
(defmacro reverse-args (expr)
(append ; append two lists
(list (first expr)) ; the first element is the procedure/macro/special form
(reverse (rest (expr))))) ; the rest of the elements are the reversed arguments
Notice how the argument didn’t evaluate to 7 before being passed in, but was
instead passed in as a list. When the resulting list was returned, only then was
it evaluated. This is how macros work, and it allows us to define new syntax
such as
let
for variable binding.
Anyways, now that I’m done motivating macros, let’s discuss how to implement them.
Our ApplyMacro
function looks something like this:
func ApplyMacro(macro *Macro, args *List, env *Env) {
// create a new environment and bind the arguments to their values
newEnv := makeEnv()
newEnv.SetParent(proc.ParentEnv)
for i, argSymbol := range proc.Args {
newEnv.Bind(argSymbol, args.Index(i))
}
// evaluate the body with the new environment
newAST := Eval(proc.Body, newEnv)
// finally evaluate the resulting AST in the current environment
return Eval(newAST, env)
}
There are two differences from ApplyMacro
:
- The arguments are not evaluated before they are bound
- The returned value is evaluate (since its an AST)
We just moved the evaluation from before application to after. While macros are very simple, they are very powerful since they can be used to essentially create new syntax for the language. The n-queens algorithm example in this repo provides some examples of macros used in a practical context, as well as the prelude where I define some new syntax using Lisp rather than Go.
Special Forms
Finally, special forms are simply special syntax implemented in Go. Because of this, they can simply be called as functions. We can define special forms like so:
type SpecialForm func (*List, *Env) Expr
Here’s an example of a define special form:
// type checking omitted and error handling omitted
func Define(args *List, env *Env) Expr {
name := args.Index(0).(*Symbol)
value := Eval(args.Index(1), env)
env.Bind(name, value)
return value
}
Which can be used like so:
(define x 3)
;; => 3
(define y (+ 1 (* x x)))
;; => 10
(* 2 (+ x y))
;; => 26
By implementing special forms in Go, we get access to the internals of our
evaluator, for example by allowing Define
to bind new values in the
environment. A Lisp programmer never sees the environment and likely never even
thinks about it if they already expect lexical scoping from their programming
languages.
Resources
Here is a collection of resources that I used while making this Lisp:
- Structure and Interpretation of Computer Programs: This book is where I was first introduced to Lisp and where I learned about its syntax, the Environment model, and how a Lisp interpreter is put together.
- MiniLisp by Rui Ueyama. A Lisp interpreter in less than 1000 lines of C. This is the first Lisp interpreter that I read the source code for.
- SectorLisp by Justine Tunney. A Lisp interpreter that assembles to less than 512 bytes. This project broke down Lisp into its absolutely smallest necessary pieces and allowed me to understand its fundamentals.
- The R5RS Scheme Standard which is the most widely implemented standard for Scheme. I used this as a reference whenever I was unsure how something should behave. For simplicity’s sake my implementation doesn’t follow this spec, but it did help guide my design decisions.
1669 Words
2022-04-18 00:00 +0000