1 week with David Beazley and SICP

I had the chance to do David Beazley’s SICP course towards the end of 2022. I have only good things to say.

There are a lot of free resources out there [1]. Having Dave as a guide, I was amazed at how much we covered in a week. I loved how he cherry-picked topics for the high-level overview while taking the time to home in on critical sections. In some cases, Dave even came up with his own examples to more clearly illustrate the concepts being discussed.

What did I learn? It’s hard to describe. I’ll explain by analogy. When I learned about compilers, I discovered that languages have much more in common than I realized; I now see the big picture [2]. With SICP it’s a bit like that, but with the idea of computation more generally.

Let’s review a few key concepts, and then go into what I found especially mind-blowing.

Starting point

In this section, we build a simple model of computation through the process of substitution. We start with a quick review of Scheme (the language used in SICP), next build a Scheme interpreter in Python, and then take a closer look at substitution.


Scheme is a dialect of Lisp. For the course we actually used Racket, which is similar but easier to set up [3]. For our purposes we only need a small part of the language; here Scheme and Racket coincide. The > in this subsection represents the Scheme REPL.

Primitives are the simplest entities in the language. We only need integers. When evaluated, we simply get them back.

> 1

For built-in operations, we need +, -, * and /. Scheme uses prefix notation, where the operator appears first and then the operands. The example here illustrates adding 2 and 3.

> (+ 2 3)

Next we have special forms. The first one we need is define, which lets us define variables. Here we define x to be 2.

> (define x 2)

> (+ x 3)

We need if for control flow, and thus include the built-ins =, > and <.

> (if (= 1 1) 2 3)

> (if (= 0 1) 2 3)

Finally we have lambda for functions. In this example, we apply an operation that multiplies a number by itself to the argument 3.

> ((lambda (x) (* x x)) 3)

By combining define and lambda, we can create user-defined functions. Here we bind the operation in the previous example to square.

> (define square (lambda (x) (* x x)))

> (square 3)

We present pairs here for use in later sections. The procedure cons lets you create data pairs. The first item in the pair is accessed with car, the remainder with cdr [4].

> (define pair (cons 1 2))

> (car pair)

> (cdr pair)

Scheme interpreter in Python

How do we write a Python function that takes in Scheme source code and returns what Scheme would return? The > in this subsection represents the Python REPL.

For primitives, we return as-is.

def evaluate(expression):
    if isinstance(expression, int):
        return expression


> evaluate(1)

To keep things simple, we represent parenthesized Scheme expressions with Python tuples.

Now let’s do built-in operations. The first item in the tuple is the operator (as a string); we look it up in definitions and substitute. We then evaluate all arguments before applying the operator (in function form).

definitions = {
    "+": lambda x, y: x + y,
    "*": lambda x, y: x * y,
    "-": lambda x, y: x - y,
    "/": lambda x, y: x / y,
}

def evaluate(expression):
    if isinstance(expression, int):
        return expression

    elif isinstance(expression, str):
        return definitions[expression]

    elif isinstance(expression, tuple):
        proc = evaluate(expression[0])
        args = [evaluate(expr) for expr in expression[1:]]
        return proc(*args)

Here we illustrate the evaluation of (+ 2 3). The comments show the result after each step.

> evaluate(("+", 2, 3))
# evaluate("+")(*[evaluate(expr) for expr in (2, 3)])
# definitions["+"](*[evaluate(expr) for expr in (2, 3)])
# (lambda x, y: x + y)(*[evaluate(expr) for expr in (2, 3)])
# (lambda x, y: x + y)(*[evaluate(2), evaluate(3)])
# (lambda x, y: x + y)(*[2, evaluate(3)])
# (lambda x, y: x + y)(*[2, 3])
# 2 + 3

With define, we evaluate the expression and then store the name-result pair in definitions.

def evaluate(expression):

    elif isinstance(expression, tuple):
        if expression[0] == "define":
            definitions[expression[1]] = evaluate(expression[2])
            return None


When called, we get the value from definitions and substitute into the expression. For simplicity, we use definitions for both built-ins and define operations.

> evaluate(("define", "x", 2))
# definitions["x"] = 2

> evaluate(("+", "x", 3))
# evaluate("+")(*[evaluate(expr) for expr in ("x", 3)])
# definitions["+"](*[evaluate(expr) for expr in ("x", 3)])
# (lambda x, y: x + y)(*[evaluate(expr) for expr in ("x", 3)])
# (lambda x, y: x + y)(*[evaluate("x"), evaluate(3)])
# (lambda x, y: x + y)(*[definitions["x"], evaluate(3)])
# (lambda x, y: x + y)(*[2, evaluate(3)])
# (lambda x, y: x + y)(*[2, 3])

For if, suppose we have the form (if predicate then else). If the predicate is true, we evaluate the then clause and return the result. Otherwise evaluate the else clause and return the result.

definitions = {
    # ... arithmetic operators as before ...
    "=": lambda x, y: x == y,
    "<": lambda x, y: x < y,
    ">": lambda x, y: x > y,
}

def evaluate(expression):

    elif isinstance(expression, tuple):

        elif expression[0] == "if":
            if evaluate(expression[1]):
                return evaluate(expression[2])
            return evaluate(expression[3])

In this example, we return 2 when true and otherwise return 3.

> evaluate(("if", ("=", 1, 1), 2, 3))
# evaluate(2) if evaluate(("=", 1, 1)) else evaluate(3)
# evaluate(2) if True else evaluate(3)
# evaluate(2)

> evaluate(("if", ("=", 0, 1), 2, 3))
# evaluate(2) if evaluate(("=", 0, 1)) else evaluate(3)
# evaluate(2) if False else evaluate(3)
# evaluate(3)

Functions take a bit more work. The basic idea is to substitute the values of the arguments in the function body, and then evaluate the resulting expression. This process is repeated until the final result is obtained.

def evaluate(expression):

    elif isinstance(expression, tuple):

        elif expression[0] == "lambda":

            def substitute(expr, name, value):
                if expr == name:
                    return value

                elif isinstance(expr, tuple):
                    return tuple(
                        [substitute(term, name, value) for term in expr]
                    )

                return expr

            def procedure(*arguments):
                names = expression[1]
                body = expression[2]

                for i, argument in enumerate(arguments):
                    name = names[i]
                    body = substitute(body, name, argument)

                return evaluate(body)

            return procedure


In this example, applying the operation that multiplies a number by itself runs the for loop only once. This has the effect of substituting x in the function body with 3.

> evaluate((("lambda", ("x",), ("*", "x", "x")), 3))
# evaluate(("lambda", ("x",), ("*", "x", "x")))(*[evaluate(expr) for expr in (3,)])
# evaluate(("lambda", ("x",), ("*", "x", "x")))(*[3])
# evaluate(substitute(("*", "x", "x"), "x", 3))
# evaluate(("*", 3, 3))
# 3 * 3

For user-defined functions, declaring the function stores it in definitions. The substitution is deferred until the function is called with arguments. When this happens we proceed as before, obtaining the same result.

> evaluate(("define", "square", ("lambda", ("x",), ("*", "x", "x"))))
# definitions["square"] = evaluate(("lambda", ("x",), ("*", "x", "x")))

> evaluate(("square", 3))
# evaluate("square")(*[evaluate(expr) for expr in (3,)])
# definitions["square"](*[evaluate(expr) for expr in (3,)])
# evaluate(("lambda", ("x",), ("*", "x", "x")))(*[evaluate(expr) for expr in (3,)])

The substitution model

Learning compilers often involves creating a toy language. In the same spirit, I wanted our minimal interpreter to support Fibonacci number generation. It’s a nice use case, requiring built-in operations, control flow and recursive function calls (twice!).

I was impressed how our previous construction can be used to generate Fibonacci numbers.

fibonacci = (
    "lambda", ("n",),
    ("if", ("<", "n", 2),
        1,
        ("+",
            ("fibonacci", ("-", "n", 2)),
            ("fibonacci", ("-", "n", 1)))))

Here we use it to generate the first 10 numbers. The full code can be found here.

> evaluate(("define", "fibonacci", fibonacci))

> [evaluate(("fibonacci", i)) for i in range(10)]
[1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
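For reference, the fragments above can be consolidated into a single runnable sketch (condensed and lightly reordered; error handling omitted):

```python
# A consolidated sketch of the substitution-based interpreter.
definitions = {
    "+": lambda x, y: x + y,
    "-": lambda x, y: x - y,
    "*": lambda x, y: x * y,
    "/": lambda x, y: x / y,
    "=": lambda x, y: x == y,
    "<": lambda x, y: x < y,
    ">": lambda x, y: x > y,
}

def substitute(expr, name, value):
    # replace every occurrence of name in expr with value
    if expr == name:
        return value
    elif isinstance(expr, tuple):
        return tuple(substitute(term, name, value) for term in expr)
    return expr

def evaluate(expression):
    if isinstance(expression, int):
        return expression

    elif isinstance(expression, str):
        return definitions[expression]

    elif isinstance(expression, tuple):
        if expression[0] == "define":
            definitions[expression[1]] = evaluate(expression[2])
            return None

        elif expression[0] == "if":
            if evaluate(expression[1]):
                return evaluate(expression[2])
            return evaluate(expression[3])

        elif expression[0] == "lambda":
            names, body = expression[1], expression[2]

            def procedure(*arguments):
                expr = body
                for name, argument in zip(names, arguments):
                    expr = substitute(expr, name, argument)
                return evaluate(expr)

            return procedure

        proc = evaluate(expression[0])
        args = [evaluate(expr) for expr in expression[1:]]
        return proc(*args)

fibonacci = (
    "lambda", ("n",),
    ("if", ("<", "n", 2),
        1,
        ("+",
            ("fibonacci", ("-", "n", 2)),
            ("fibonacci", ("-", "n", 1)))))

evaluate(("define", "fibonacci", fibonacci))
print([evaluate(("fibonacci", i)) for i in range(10)])
# [1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
```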

It feels like we haven’t done all that much, yet we end up with a pretty sophisticated system. And we get there by applying nothing but substitution.

Let’s phrase this in SICP terms.

In programming, we deal with two kinds of elements: procedures and data. … Informally, data is “stuff” that we want to manipulate, and procedures are descriptions of the rules for manipulating the data.

Hence the idea of substitution serves as an introductory mental model of how a procedure works. In fact, the first two chapters can be described entirely via substitution (about one-third of a ~600-page book).

SICP does emphasize that interpreters do not typically operate by means of substitution. However, this model provides a useful starting point [5].


In this section, we explore where the substitution model breaks down.

The following shorthand notation can be used to define procedures in Scheme; note how the lambda is omitted. This will be our notation going forward.

(define (square x) (* x x))


Suppose we’re modeling a bank account with a balance that changes. Scheme has the special form set! to change the value of an existing variable. We use it in the procedure withdraw that reduces the balance by the value amount and returns the updated balance.

(define balance 100)

(define (withdraw amount)
  (set! balance (- balance amount))
  balance)

When we make withdrawals, the balance changes as expected. The > from this point onwards represents the Scheme REPL.

> balance

> (withdraw 20)

However, if we simply apply the substitution model and replace balance with the value 100, withdraw will always return the initial balance.

(define (withdraw amount)
  (set! 100 (- 100 amount))
  100)

In other words, assignment breaks the substitution model, which cannot distinguish between the balance before the set! and the balance after the set!.

To accommodate assignment, we can no longer have define serving as an alias. We need to introduce an environment where variables are defined, with corresponding get and set operators. Each procedure now has an attached environment in which it was defined, and every application of the procedure creates a new environment that holds the procedure arguments.
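The environment picture has a rough analogue in Python closures. Here is a sketch of the bank account as a closure over a mutable variable (make_withdraw is my own translation, echoing SICP's later make-withdraw, not code from the course):

```python
def make_withdraw(balance):
    # each call to make_withdraw creates a fresh environment holding balance
    def withdraw(amount):
        nonlocal balance  # mutate the enclosing environment, not a copy
        balance = balance - amount
        return balance
    return withdraw

withdraw = make_withdraw(100)
print(withdraw(20))  # 80
print(withdraw(20))  # 60
```

Each call to withdraw evaluates in a new environment for the arguments, chained to the environment where balance lives; that chain is exactly what the box-and-arrow diagrams depict.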

The environment determines the context in which an expression should be evaluated, but it can be challenging to understand how multiple environments relate to each other. I distinctly recall Dave patiently walking us through different scenarios, sketching box-and-arrow diagrams along the way. Having this exposition before a closer review of the section in the book is priceless.


Let’s briefly review another model. Instead of a single change to the balance, suppose we have the sequence amounts that represents additions to or subtractions from the balance. Consider the following procedure, which gives us the whole balance history [6].

(define (withdraw balance amounts)
  (cons balance
        (withdraw (- balance (car amounts)) (cdr amounts))))

Let’s take this a step further. What if there are terms in amounts that do not yet exist, either because the event has not yet occurred or because generating the next term requires a forced evaluation? This notion of delayed evaluation (analogous to a generator in Python) turns the sequence into what’s called a stream.
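To make the generator analogy concrete, here is a toy Python sketch of the balance history as a lazily produced stream (my own example, not from the book):

```python
def balances(balance, amounts):
    # lazily yield the balance before and after each change
    for amount in amounts:
        yield balance
        balance = balance - amount
    yield balance

history = balances(100, [20, 30, 10])  # nothing is computed yet
print(list(history))  # forcing the stream: [100, 80, 50, 40]
```

No assignment to an account object ever happens; the "state" is just our position in the sequence.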

When stream processing was described as a way to “model systems that have state without ever using assignment or mutable data”, all the pieces came together [7].


At the start of Chapter 3, SICP emphasizes how important it is to base the structure of our program on the structure of the system being modeled. As such, modeling a real-world bank account balance seems most natural with mutable variables.

The imperative or mutable model of programming is based on the idea of using assignments and mutable data structures to represent and manipulate state. We introduced environments to be able to handle mutable data. This is in contrast to the functional or immutable model, where pure functions are used to manipulate data without changing the state of the system [8].

It is worth noting that the imperative model introduces side effects. This means that the order of operations now matters, as we need to ensure every statement uses the correct version of each variable [9]. Managing this complexity gets worse when we have concurrency [10].

The other tricky issue has to do with the concept of equality. Are two objects equal when they have the same value? Are they equal if they have the same memory address? Do we need multiple equality operators? Different languages handle this differently, making different trade-offs along the way.

The end of Chapter 3 says this best.

We can model the world as a collection of separate, time-bound, interacting objects with state, or we can model the world as a single, timeless, stateless unity. Each view has powerful advantages, but neither view alone is completely satisfactory. A grand unification has yet to emerge.

Mind blown

Now for the fun parts…

Lambda calculus

Consider the following procedures that take in other procedures as arguments [11]. In particular, zero maps any procedure to the identity procedure.

(define (identity x) x)

(define (zero f) identity)

(define (increment n)
  (lambda (f)
    (lambda (x) (f ((n f) x)))))

We apply zero and (increment zero) to a procedure that adds 1.

> ((zero (lambda (x) (+ x 1))) 0)

> (((increment zero) (lambda (x) (+ x 1))) 0)

Essentially what we have are integers represented as procedures. That’s right. Numbers as procedures. Take a moment to let that sink in…

This representation is known as Church numerals, after Alonzo Church.

We define one as a procedure that takes a procedure argument and applies it once. We can also arrive at this result by expanding out (increment zero). Recall that zero maps a procedure f to identity, and identity maps x to itself.

(define (one f)
  (lambda (x) (f x)))

; (increment zero)
; (lambda (f) (lambda (x) (f ((zero f) x))))
; (lambda (f) (lambda (x) (f (identity x))))
; (lambda (f) (lambda (x) (f x)))

Now the encore. If one applies a procedure argument once, we expect two to apply it twice. Expanding out (increment one) confirms this is indeed the case.

(define (two f)
  (lambda (x) (f (f x))))

; (increment one)
; (lambda (f) (lambda (x) (f ((one f) x))))
; (lambda (f) (lambda (x) (f (f x))))

We apply one and two to the same procedure above. It’s magic.

> ((one (lambda (x) (+ x 1))) 0)

> ((two (lambda (x) (+ x 1))) 0)
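The construction translates almost symbol-for-symbol into Python, where procedures are just functions. A sketch for illustration:

```python
identity = lambda x: x

def zero(f):
    # zero maps any procedure to the identity procedure
    return identity

def increment(n):
    # the successor applies f one more time than n does
    return lambda f: lambda x: f(n(f)(x))

one = increment(zero)
two = increment(one)

add1 = lambda x: x + 1
print(zero(add1)(0))  # 0
print(one(add1)(0))   # 1
print(two(add1)(0))   # 2
```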

In our Scheme interpreter above, we handled the special forms define, if and lambda. We need lambda to create procedures, so that’s a keeper. We can sort of do without define. That leaves us with if, or does it?

(define (true x)
  (lambda (y) x))

(define (false x)
  (lambda (y) y))

(define (if f)
  (lambda (a)
    (lambda (b) ((f a) b))))

Yes, it does work. The interpreter needs a tweak; tests here will now pass.

> (((if true) 2) 3)

> (((if false) 2) 3)
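The booleans-as-procedures trick works just as well in Python (if_ is my naming, since if is reserved):

```python
true = lambda x: lambda y: x   # always picks the first argument
false = lambda x: lambda y: y  # always picks the second argument

def if_(f):
    # procedure-only "if": let the boolean do the choosing
    return lambda a: lambda b: f(a)(b)

print(if_(true)(2)(3))   # 2
print(if_(false)(2)(3))  # 3
```

One caveat: Python, like Scheme, evaluates arguments eagerly, so both branches are computed before the boolean picks one. This is precisely why if must be a special form in the first place.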

Chapter 2 is dedicated to manipulating complex forms with cons, car and cdr. In particular, compound data structures such as lists and dictionaries can be built on top of pairs.

It turns out you can also represent these with procedures.

(define (cons x y)
  (lambda (m) (m x y)))

(define (car z)
  (z (lambda (p q) p)))

(define (cdr z)
  (z (lambda (p q) q)))

Here we go.

> (car (cons 1 2))

> (cdr (cons 1 2))
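The same definitions carry over to Python directly; a sketch:

```python
def cons(x, y):
    # a pair is a procedure waiting to hand x and y to a selector
    return lambda m: m(x, y)

def car(z):
    return z(lambda p, q: p)

def cdr(z):
    return z(lambda p, q: q)

pair = cons(1, 2)
print(car(pair))  # 1
print(cdr(pair))  # 2
```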

Let’s circle back to the idea that the building blocks of programming are data and procedures. Based on our excursion here, we can say the following.

All you need is procedures.

Theoretically we can now write Fibonacci using only procedures. Perhaps for another time…

Applicative vs normal order

There are lots of subtle-yet-powerful ideas in SICP. As per Dave.

Major parts of the book are devoted to setting up competing approaches to computation. Consequences and tradeoffs of these decisions are then explored. Often, the differences are subtle, but insightful.

Scheme uses applicative-order evaluation to evaluate procedures, which means arguments are fully evaluated before the procedure is applied. This strategy is often referred to as eager evaluation or call-by-value. In this example, (+ 1 2) is evaluated first.

> (square (+ 1 2))
; (square 3)
; (* 3 3)

This is how most programming languages work, and is how we constructed our interpreter.

In contrast, normal-order evaluation defers the evaluation of arguments until they are actually needed. This strategy is often referred to as lazy evaluation or call-by-name (call-by-need when the results are memoized).

> (square (+ 1 2))
; (* (+ 1 2) (+ 1 2))
; (* 3 (+ 1 2))
; (* 3 3)

Applicative order is simpler to implement and understand. It can also be more efficient; in our example (+ 1 2) is evaluated once. That said there are cases where applicative order leads to unnecessary evaluations, while normal order would skip those steps. Normal order does have extra overhead; it requires tracking which parts have been evaluated and which have not.

At first glance, it may appear that we get the same answer either way. Like lambda calculus, this is an insight-bomb hiding in an exercise.

> (define (p) (p))

> (define (test x y) (if (= x 0) 0 y))

> (test 0 (p))
; spins forever

Running the last line under applicative order, we go into an infinite loop: (p) must be evaluated before test is applied. In normal order, the argument is never needed, so we simply return 0 [12].

Recursion vs iteration

Scheme does not have for loops. Instead we perform repeated operations through recursion, in which the procedure calls itself. The example here calculates factorials.

(define (factorial n)
  (if (= n 1)
    1
    (* n (factorial (- n 1)))))

A recursive process always leaves some part of the computation behind to be completed later. This results in a chain of pending operations that grows until we hit the base case, at which point the chain starts to collapse.

> (factorial 6)
; (* 6 (factorial 5))
; (* 6 (* 5 (factorial 4)))
; (* 6 (* 5 (* 4 (factorial 3))))
; (* 6 (* 5 (* 4 (* 3 (factorial 2)))))
; (* 6 (* 5 (* 4 (* 3 (* 2 (factorial 1))))))
; (* 6 (* 5 (* 4 (* 3 (* 2 1)))))
; (* 6 (* 5 (* 4 (* 3 2))))
; (* 6 (* 5 (* 4 6)))
; (* 6 (* 5 24))
; (* 6 120)
; 720

To avoid this growth-and-collapse pattern, we can use what SICP calls iteration. The idea is we keep track of a counter and a result, which are ‘carried forward’ on each procedure call. The calculation returns the result when the stopping point is reached.

(define (factorial n)
  (define (fact-iter product counter max-count)
    (if (> counter max-count)
      product
      (fact-iter (* counter product) (+ counter 1) max-count)))
  (fact-iter 1 1 n))

The subprocedure (or closure) fact-iter is called by factorial, and returns the product once counter is larger than max-count (here 6).

> (factorial 6)
; (fact-iter   1 1 6)
; (fact-iter   1 2 6)
; (fact-iter   2 3 6)
; (fact-iter   6 4 6)
; (fact-iter  24 5 6)
; (fact-iter 120 6 6)
; (fact-iter 720 7 6)
; 720

As an aside, it’s interesting to note that these approaches can be related to problem-solving techniques. Recursion is ‘breaking down large problems into smaller ones’, iteration is ‘start small and gradually build up to a result’.
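For comparison, both shapes translate directly into Python (names are mine):

```python
def factorial_recursive(n):
    # leaves a chain of pending multiplications behind
    if n == 1:
        return 1
    return n * factorial_recursive(n - 1)

def factorial_iterative(n):
    # carries the running product and counter forward on each call
    def fact_iter(product, counter, max_count):
        if counter > max_count:
            return product
        return fact_iter(counter * product, counter + 1, max_count)
    return fact_iter(1, 1, n)

print(factorial_recursive(6))  # 720
print(factorial_iterative(6))  # 720
```

Note that CPython does not optimize tail calls, so in Python both versions still grow the stack.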

From a space perspective, this iterative process should be as efficient as a for loop. In terms of stack frames, however, the repeated procedure calls would normally use up increasing amounts of memory. To get around this, Scheme takes advantage of a technique called tail-call optimization, which eliminates the memory overhead by implementing what is practically a for loop under the hood.

Peter Norvig even implements tail-call optimization in his Scheme interpreter here.


OK so Dave cherry-picked topics for a week-long course; I cherry-picked from those topics for a blog post. What did I leave out? A fair bit. Metacircular evaluator. Repeated procedure application to calculate square roots. Discussions on classes, functional programming, macros, types, symbolic manipulation…

It’s unclear when I’ll get to writing a follow-up post; I’m already on to Dave’s other courses. That’s right, I’m all-in. In the meantime, Chelsea Troy has a more comprehensive write-up here.

Is the material too theoretical for software engineers? Maybe. I will, however, leave you with this quote by Donald Knuth.

If you find that you’re spending almost all your time on theory, start turning some attention to practical things; it will improve your theories. If you find that you’re spending almost all your time on practice, start turning some attention to theoretical things; it will improve your practice.

[1] The SICP book can be found here, recorded lectures here.

[2] Differences include static vs dynamic types, compiled vs interpreted, with vs without garbage collection. The course I did started off with this wonderful quote by Ras Bodik.

Take historical note of textile and steel industries: do you want to build machines and tools, or do you want to operate those machines?

[3] The name Racket is a playful reference to Scheme, in the sense that the word ‘scheme’ means a plan that is often dishonest or illegal.

[4] Clojure uses first and rest as more intuitively named versions of car and cdr.

[5] This is described in SICP here.

The purpose of substitution is to help us think about procedure application, not to provide a description of how the interpreter really works. Typical interpreters do not evaluate procedure applications by manipulating the text of a procedure to substitute values for the formal parameters. … The substitution model is only the first of these models — a way to get started thinking formally about the evaluation process.

[6] For convenience, we did not distinguish between the non-stream and stream versions of car and cdr.

[7] This phrasing came out of a different SICP session, details here.

[8] Bob Nystrom has an amusing paragraph in Crafting Interpreters on the opposing camps.

Mutating a variable is a side effect and, as the name suggests, some language folks think side effects are dirty or inelegant. Code should be pure math that produces values — crystalline, unchanging ones — like an act of divine creation. Not some grubby automaton that beats blobs of data into shape, one imperative grunt at a time.

[9] Exercise 3.8 involves coming up with a procedure f such that evaluating (+ (f 0) (f 1)) returns 0 but (+ (f 1) (f 0)) returns 1.

[10] Bartosz Milewski calls for a return to functional programming in the multicore world, in the preface of his book here.

[11] This subsection is actually set up as Exercise 2.6 in the book.

[12] Another way to think about special forms is as procedures where ‘normal rules do not apply’. In the case of if, evaluating the else clause can lead to an infinite loop, but it is skipped entirely when the predicate is true.