Chapter 1 — Building Abstractions with Procedures

1. conjure v.

conjure v. / 'kʌndʒə(r) /

~ / ~ sth: make sth appear, disappear, or change as if by magic
- Her grandfather taught her to conjure. 她的祖父教她变魔术。
- He could conjure coins from behind people's ears. 他可以从人们的耳朵后面变出硬币来。
- The magician conjured a rabbit out of his hat.
- Thirteen years ago she found herself having to conjure a career from thin air. 13年前，她认识到自己得白手起家创造出一番事业来。
- They managed to conjure a victory. 他们出人意料地取得了胜利。
- He has conjured victories from worse situations than this.
- He conjured a delicious meal out of a few leftovers. 他居然用几样吃剩的东西做出了可口的一餐。
- ~ up sth
  - Every day a different chef will be conjuring up delicious dishes in the restaurant. 每天，饭店里会有一位不同的大厨像变戏法似的奉上可口的菜肴。
  - He conjured up a smile and reached out to squeeze her hand. 他马上露出笑脸，伸手去捏她的手。
  - Dieting always seems to conjure up images of endless salads.
  - Somehow we have to conjure up another $10,000.
  - Anne conjured up a most delicious home-made hot pot. 安妮魔术般地变出了一壶烫好的极醇美的自酿酒。
recall (an image); cause to feel or think of sth
- she had forgotten how to conjure up the image of her mother's face. 她已想不起她母亲的脸长得什么样了。
- a special tune that conjures up a particular time and place. 令人想起特别时刻及场合的专用曲调。
call upon (a spirit or ghost) to appear by means of a magic ritual. 施魔法召唤（神灵，鬼魂）。
- they hoped to conjure up the spirit of their dead friend. 他们希望能施魔法召来已逝朋友的灵魂。

2. Applicative order vs. normal order

In applicative order, the interpreter first evaluates the operator & operands and then applies the resulting procedure to the resulting arguments. But in normal order, the interpreter would not evaluate the operands until their values were needed. Moreover, the value is evaluated every time it is needed, instead of once on entrance.

For example, given some procedures:

(define (square x)
  (* x x))
(define (sum-of-squares a b)
  (+ (square a) (square b)))
(define (f a)
  (sum-of-squares (+ a 1) (* a 2)))

and evaluate the expression (f 5). In applicative order, the interpreter "evaluate the arguments and then apply":

(f 5)                             ; replace in body: a <- 5
(sum-of-squares (+ 5 1) (* 5 2))  ; evaluate operator & operands
(sum-of-squares 6 10)             ; replace in body: a <- 6, b <- 10
(+ (square 6) (square 10))        ; evaluate operator & operands
(+ (* 6 6) (* 10 10))
(+ 36 100)
136

In normal order, the interpreter "fully expand and then reduce":

(f 5)
(sum-of-squares (+ 5 1) (* 5 2))
(+ (square (+ 5 1)) (square (* 5 2)))
(+ (* (+ 5 1) (+ 5 1)) (* (* 5 2) (* 5 2))) ; fully expanded, start reducing
(+ (* 6 6) (* 10 10))
(+ 36 100)
136

In applicative order, each operator (and operand) is evaluated only once. The first operand of sum-of-squares ((+ 5 1)) is first evaluated to 6 and passed into the procedure body. However, in normal order, that operand is directly passed into the body of sum-of-squares, and duplicated in the body of square. As a result, (+ 5 1) is evaluated twice in normal order.

Although normal order evaluation may introduce duplicated evaluation, its "evaluate until needed" nature allows skipping the evaluation of some of the arguments, for example:

(define (ite condition then else)       ; if-then-else
  (if condition then else))

(ite (> x y) (heavy-computation) 0)

Even though ite is only a normal function (not a special form), when the condition (> x y) is not met, the (heavy-computation) is not evaluated and 0 is returned. For more information on different kinds of evaluation strategies, see this wikipedia article.

The article Normal, Applicative and Lazy Evaluation contains a more formal definition of normal-order evaluation.

3. Functions vs special forms

3.1. `define`

3.2. `if` & `cond`

In Scheme, #f is interpreted as false, and #t (or any other value other than #f) is true.

(> 1 2)                                 ; #f
(< 1 2)                                 ; #t

(define (abs x)
  (cond [(> x 0) x]
        [(= x 0) 0]
        [(< x 0) (- x)]))
(define (abs x)
  (cond [(< x 0) (- x)]
        [else x]))

else is a special symbol that can be used as the final predicate of cond. In fact, any value other than #f can be used in place of else.

3.3. `and` & `or`

and and or are special forms, as not all operands are necessarily evaluated. However, not is an ordinary procedure, as it only takes and evaluates one operand.

and returns the value of the first expression that evaluates to a false value, or the value of the last expression, if all expressions evaluate to true values.

(and (= 2 2) (> 2 1))                   ; #t
(and (= 2 2) (< 2 1))                   ; #f
(and 1 2 'c '(f g))                     ; (f g)
(and)                                   ; #t

Similarly, or returns the first expression that evaluate to a true value, or the value of the last expression (#f), if all expressions evaluate to false values.

(or (= 2 2) (> 2 1))                    ; #t
(or (= 2 2) (< 2 1))                    ; #t
(or #f #f #f)                           ; #f
(or 123 (/ 3 0))                        ; 123

Note that (/ 3 0) is not evaluated.

4. Find the smallest divisor

(define (smallest-divisor n)         ; find the smallest divisior of n
  (define (find-divisior n test-divisior)
    (cond [(> (square test-divisior) n) n]
          [(divides? test-divisior n) test-divisior]
          [else (find-divisior n (+ test-divisior 1))]))
  (define (divides? a b)
    (= (remainder b a) 0))
  (find-divisior n 2))

5. Accumulate, sum, prod

accumulate takes an initial value (null-value) and a way to combine the running total with the new term (combiner).

;;; recursive
(define (accumulate combiner null-value term a next b)
  (if (> a b)
      null-value
      (combiner (accumulate combiner
                            null-value
                            term
                            (next a)
                            next
                            b)
                (term a))))
;;; iterative
(define (accumulate combiner null-value term a next b)
  (define (iter a total)
    (if (> a b)
        total
        (iter (next a) (combiner total (term a)))))
  (iter a null-value))

Both sum and prod can be defined in terms of accumulate.

(define (sum term a next b)
  (accumulate + 0 term a next b))

(define (prod term a next b)
  (accumulate * 1 term a next b))

(define (sum-cubes a b)
  (sum cube a 1+ b))

(define (sum-integers a b)
  (sum identity a 1+ b))

(define (pi-sum a b)
  (sum (lambda (x) (/ 1.0 (* x (+ x 2))))
       a
       (lambda (x) (+ x 4))
       b))

(define (fact n)
  (prod (lambda (x) x)
        1
        1+
        n))

6. Local variables

Functions take parameters, which can be used as local variables. Take for example the function: \[ f(x, y) = x(1+xy)^2 + y(1-y) + (1+xy)(1-y). \] Let $a = (1+xy)$, $b = (1-y)$, so $f(x, y) = x a^2 + y b + a b$.

(define (f x y)
  (define (f-helper a b)           ; use parameters as local variables
    (+ (* x (square a))
       (* y b)
       (* a b)))
  (f-helper (+ 1 (* x y))          ; a = 1 + xy
            (- 1 y)))              ; b = 1 - y

The helper function is called only once, so it can be replaced with a lambda expression:

(define (f x y)
  ((lambda (a b)     ; use lambda expression instead of named functions
     (+ (* x (square a))
        (* y b)
        (* a b)))
   (+ 1 (* x y))     ; a = 1 + xy
   (- 1 y)))         ; b = 1 - y

This is equivalent to using the let special form:

(define (f x y)
  (let ((a (+ 1 (* x y)))
        (b (- 1 y)))
    (+ (* x (square a))
       (* y b)
       (* a b))))

Another way to introduce local variable is using define.

Local variables can be implemented as function parameters.

No new mechanism is required in the interpreter in order to provide local variables. A let expression is simply syntactic sugar for the underlying lambda application.

Since let is only syntactic sugar, the local variables are calculated in the same way as function parameters, meaning:

They are computed in parallel, not in sequence. The expression
```
(let ([a 10]
      [b (+ a a)])
  b)
```
results in error "Unbound variable: a". b cannot use the value of the preceding variable (parameter) a.
The symbols used in their computation are from the outer scope. As a result, the expression
```
(define x 2)                            ; [1]
(let ([x 3]                             ; [2]
      [y (+ x 2)])
  (* x y))
```
has 12 as the result. The value of y is computed using the global variable x in [1] (outer scope), not [2].

7. Fixed-point & Newton's method

7.1. Fixed-point

A number $x$ is called a fixed point of a function $f$ if $f(x) = x$. For some function $f$ we can locate a fixed point by beginning with an initial guess and applying $f$ repeatedly, \[ f(x), \quad f(f(x)), \quad f(f(f(x))), \quad \ldots \] until the value does not change very much.

(define (fixed-point f initial-guess)
  (define tolerance 0.001)
  (define (close-enough? a b)
    (< [abs (- a b)] tolerance))
  (define (try guess)
    (let ([next (f guess)])
      (if (close-enough? guess next)
          next
          (try next))))
  (try initial-guess))

To find $\sqrt{x}$ means finding the fixed point of the function $f(y) = x/y$. However, consider an initial guess $y_1$. The next guess is $y_2 = f(y_1) = x / y_1$, and the next one $y_3 = f(y_2) = x / (x / y_1) = y_1$. The guesses will oscillate between $y_1$ and $y_2$, never converging.

Applying the technique of average damping can solve this problem. Here average-damp is a procedure that takes a procedure f and returns another procedure—the average damped version of f.

(define (average x y) (/ (+ x y) 2))

(define (average-damp f)
  (lambda (x) (average x (f x))))

(define (sqrt x)
  (fixed-point (average-damp (lambda (y) (/ x y)))
               1.0))

(sqrt 9)                                ; 3.000000001396984

Notice that cube root is the fixed point of the function $f(y) = x / y^2$:

(define (cube-root x)
  (fixed-point (average-damp (lambda (y) (/ x (square y))))
               1.0))

(cube-root 27)                          ; 2.9998228753561564

7.2. Newton's method

If $g(x)$ is a differentiable function, then a solution of $g(x)=0$ is a fixed point of the function $f(x)$, where \[ f(x) = x - \frac{g(x)}{g'(x)}. \]

First we expression the idea of a derivative: \[ g'(x) = \frac{g(x + dx) - g(x)}{dx}. \] Just like average damping, deriv transforms a function into another function:

(define (deriv g)
  (define dx 0.001)
  (lambda (x)
    (/ (- (g (+ x dx))
          (g x))
       dx)))

With the aid of deriv, we can express Newton's method as a fixed-point process. Here newton-transform converts the problem of finding $g(x) = 0$ to finding $f(x) = x$.

(define (newton-transform g)
  (lambda (x)
    (- x (/ (g x)
            ((deriv g) x)))))

(define (newtons-method g guess)
  (fixed-point (newton-transform g) guess))

Thus we can calculate $\sqrt{x}$:

(define (sqrt x)
  (newtons-method (lambda (y) (- (square y) x))
                  1.0))

(sqrt 9)                                ; 3.0000000174227237

Note that the resulting lambda expression in newton-transform calculates the derivative of $g$ every time it is called, since it does not save the result of (deriv g). This is very inefficient. Using a local variable dg to hold the result so deriv is called only once:

(define (newton-transform g)
  (let ([dg (deriv g)])
    (lambda (x)
      (- x (/ (g x)
              (dg x))))))

7.3. `fixed-point-of-transform`

We calculated sqrt using both the fixed point search and Newton's method:

;;; fixed point
(define (sqrt x)                        ; [1]
  (fixed-point (average-damp (lambda (y) (/ x y)))
               1.0))
;;; Newton's method
(define (sqrt x)                        ; [2]
  (newtons-method (lambda (y) (- (square y) x))
                  1.0))

The latter [2] expands to:

(define (sqrt x)                        ; [3]
  (fixed-point (newton-transform (lambda (y) (- (square y) x)))
               1.0))

Both [1] and [3] have the same pattern—each method begins with a function and finds a fixed point of some transformation of the function (average-damp or newton-transform). We can express this general idea itself as a procedure:

(define (fixed-point-of-transform g transform guess)
  (fixed-point (transform g)
               guess))

Then the two methods become:

(define (sqrt x)
  (fixed-point-of-transform (lambda (y) (/ x y))
                            average-damp
                            1.0))

(define (sqrt x)
  (fixed-point-of-transform (lambda (y) (- (square y) x))
                            newton-transform
                            1.0))

8. Compose

Let $f$ and $g$ be two one-argument functions. The composition $f$ after $g$ is $f(g(x))$:

(define (compose f g)
  (lambda (x) (f (g x))))

((compose square 1+) 6)                 ; => (square (1+ 6)) => 49

Applying a function $f$ $n$ times yields \[ f(f(\cdots f(x) \cdots)). \] We can either return $f$ when $n=1$, or return an identity function when $n=0$. The latter produces the correct result even when $n=0$.

(define (repeated f n)
  (if (= n 1)
      f
      (compose f
               (repeated f (- n 1)))))

(define (repeated f n)
  (if (= n 0)
      identity
      (compose f
               (repeated f (- n 1)))))

((repeated 1+ 10) 5)                    ; 15

Alternatively, there's an iterative implementation:

(define (repeated f n)
  (define (iter n res)
    (if (= n 0)
        res
        (iter (- n 1) (compose f res))))
  (iter n identity))

((repeated 1+ 10) 5)                    ; 15

9. `lambda` for recursion

How to write a recursive function using only lambda? The main problem, of course, is how can a lambda expression call itself when it doesn't have a name for itself?

Section 3.2 of The Scheme Programming Language gives the answer: simply pass the lambda procedure to itself:

(let ([sum (lambda (sum l)
             (if (null? l)
                 0
                 (+ (car l) (sum sum (cdr l)))))])
  (sum sum '(1 2 3 4)))                 ; 10

The let expression is essentially another lambda, here we give it a better name:

((lambda (sum)
   (sum sum '(1 2 3 4)))
 (lambda (self l)
   (if (null? l)
       0
       (+ (car l) (self self (cdr l)))))) ; 10

Here is a factorial using two lambdas, only slight difference:

((lambda (f x)
   (f f x))
 (lambda (self n)
   (if (= n 0)
       1
       (* n (self self (- n 1)))))
 5)                                     ; 120

This stack overflow question uses three lambdas. The answers below has an explanation covering Y combinator.

(((lambda (x) (x x))                    ; [1]
  (lambda (fact-gen)                    ; [2]
    (lambda (n)                         ; [3]
      (if (zero? n)
          1
          (* n ((fact-gen fact-gen) (- n 1)))))))
 5)                                     ; 120

[3] is the factorial function. If [3] were given the name fact, then (fact-gen fact-gen) is just fact itself. [2] is a generator function whose parameter (fact-gen) is also a generator function (so [2] can use itself as parameter) and returns the factorial function. [1] takes a generator function ([2]) and applies the function to itself, thereby obtaining as return value the factorial function.

This answer uses named let:

((lambda (n)
   (let sub ((i n) (z 1))
     (if (zero? i)
         z
         (sub (- i 1) (* z i)) )))
 5 )                                    ; 120

10. Exercises

10.1. Ex 1.3 — the smallest of the three

Define a procedure that takes three numbers as arguments and returns the sum of the squares of the two larger numbers.

When looking for the smallest value, the predicate smaller or equal to (<=) must be used. If only < is used, in evaluating (f 2 2 3), the first two and condition will evaluate to false. The result would be (sum-of-squares 2 2), which is very wrong.

(define (sum-of-squares a b)
  (+ (* a a) (* b b)))

(define (f a b c)
  (cond [(and (<= a b) (<= a c)) (sum-of-squares b c)]
        [(and (<= b a) (<= b c)) (sum-of-squares a c)]
        [else                    (sum-of-squares a b)]))

(f 2 2 3)                               ; 13

In order to find the two larger ones out of three, a simpler solution:

(define (f a b c)
  (sum-of-squares (max a b)
                  (max (min a b) c)))

For the first two numbers (a, b), at least one of them is in the result. So the bigger one ((max a b)) must be in the result. As for the smaller one ((min a b)), it needs to be compared with c.

10.2. Ex 1.5 — applicative-order & normal-order

Ben Bitdiddle has invented a test to determine whether the interpreter he is faced with is using applicative-order evaluation or normal-order evaluation. He defines the following two procedures:
(define (p) (p))

(define (test x y)
  (if (= x 0)
      0
      y))
Then he evaluates the expression
(test 0 (p))
What behavior will Ben observe with an interpreter that uses applicative-order evaluation? What behavior will he observe with an interpreter that uses normal-order evaluation? Explain your answer. (Assume that the evaluation rule for the special form if is the same whether the interpreter is using normal or applicative order: The predicate expression is evaluated first, and the result determines whether to evaluate the consequent or the alternative expression.)

Using the substitution model, (p) infinitely expands to itself. Evaluating (p) will lead to an endless recursion.

In applicative-order evaluation, the interpreter first evaluates all its operands, including (p). So the whole expression will not evaluate to any result.

However, in normal-order evaluation, not all operands will necessarily be evaluated (not until they are actually needed). The expression is first expanded into (if (= 0 0) 0 (p)). Since the predicate is true, the (p) on the false branch is never needed. The whole expression evaluates to 0.

10.3. Ex 1.16 — iterative fast exponentiation

Design a procedure that evolves an iterative exponentiation process that uses successive squaring and uses a logarithmic number of steps, as does fast-expt. (Hint: Using the observation that $(b^{n/2})^2 = (b^2)^{n/2}$, keep, along with the exponent $n$ and the base $b$, an additional state variable $a$, and define the state transformation in such a way that the product $a b^n$ is unchanged from state to state. At the beginning of the process a is taken to be $1$, and the answer is given by the value of $a$ at the end of the process. In general, the technique of defining an invariant quantity that remains unchanged from state to state is a powerful way to think about the design of iterative algorithms.)

Original recursive code to compute $b^n$:

(define (fast-expt b n)
  (cond [(= n 0) 1]
        [(even? n) (square (fast-expt b (/ n 2)))]
        [else (* b (fast-expt b (- n 1)))]))

Iterative code:

(define (fast-expt b n)
  (define (iter b n prod)
    (cond [(= n 0) prod]
          [(even? n) (iter (square b) (/ n 2) prod)]
          [else (iter b (- n 1) (* prod b))]))
  (iter b n 1))
;; the same thing:
(define (fast-expt b n)
  (define (iter a b n)                  ; a * b^n
    (cond [(= n 0) a]
          [(even? n) (iter a (square b) (/ n 2))]
          [else (iter (* a b) b (- n 1))]))
  (iter 1 b n))

10.4. Ex 1.44 — order of application

The idea of smoothing a function is an important concept in signal processing. If $f$ is a function and $dx$ is some small number, then the smoothed version of $f$ is the function whose value at a point $x$ is the average of $f(x-dx)$, $f(x)$, and $f(x+dx)$. Write a procedure smooth that takes as input a procedure that computes $f$ and returns a procedure that computes the smoothed $f$. It is sometimes valuable to repeatedly smooth a function (that is, smooth the smoothed function, and so on) to obtain the n-fold smoothed function. Show how to generate the n-fold smoothed function of any given function using smooth and repeated from Exercise 1.43.

The definition of smooth is quite easy:

(define (smooth f)
  (define dx 0.01)
  (define (average a b c)
    (/ (+ a b c) 3))
  (lambda (x)
    (average (f (- x dx))
             (f x)
             (f (+ x dx)))))

((smooth square) 2)                     ; 4.000066666666666
((smooth (smooth square)) 2)            ; 4.000133333333333
((smooth (smooth (smooth square))) 2)   ; 4.0001999999999995

However, the repeated application of smooth should be written as:

(define (n-fold-smooth f n)
  ((repeated smooth n) f))

((n-fold-smooth square 1) 2)            ; 4.000066666666666
((n-fold-smooth square 2) 2)            ; 4.000133333333333
((n-fold-smooth square 3) 2)            ; 4.0001999999999995

Not as:

(define (wrong f n)
  (repeated (smooth f) n))

((wrong square 1) 2)                    ; 4.000066666666666
((wrong square 2) 2)                    ; 16.00060000444444
((wrong square 3) 2)                    ; 256.01926716889415

The wrong implementation actually expands to:

((smooth square) ((smooth square) 2))   ; 16.00060000444444
((smooth square) ((smooth square)
                  ((smooth square) 2))) ; 256.01926716889415

10.5. Ex 1.45 — n-th root

Comput $\sqrt[n]{x}$ by calculating the fixed point of the function $x / y^{n-1}$ average damped $\lfloor \log_2 n \rfloor$ times.

(define (nth-root x n)
  (define (log2 n) (/ (log n) (log 2)))
  (let ([c (inexact->exact (floor (log2 n)))])
    (fixed-point ((repeated average-damp c)
                  (lambda (y) (/ x (expt y (- n 1)))))
                 1.0)))

10.6. Ex 1.46 — iterative improvement

Several of the numerical methods described in this chapter are instances of an extremely general computational strategy known as iterative improvement. Iterative improvement says that, to compute something, we start with an initial guess for the answer, test if the guess is good enough, and otherwise improve the guess and continue the process using the improved guess as the new guess. Write a procedure iterative-improve that takes two procedures as arguments: a method for telling whether a guess is good enough and a method for improving a guess. iterative-improve should return as its value a procedure that takes a guess as argument and keeps improving the guess until it is good enough. Rewrite the sqrt procedure of Section 1.1.7 and the fixed-point procedure of Section 1.3.3 in terms of iterative-improve.

(define (iterative-improve  good-enouth? improve)
  (define (try guess)
    (if (good-enouth? guess)
        guess
        (try (improve guess))))
  try)

(define (fixed-point f first-guess)
  ((iterative-improve
    (lambda (guess)
      (< [abs (- guess (f guess))] 0.00001))
    f)
   first-guess))

(define (average-damp f)
  (lambda (x) (/ (+ x (f x)) 2)))

(define (sqrt x)
  (fixed-point (average-damp (lambda (y) (/ x y)))
               1.0))

(sqrt 9)                                ; 3.000000001396984

Author	thebesttv
Created	2021-07-04 09:15
Modified	2023-03-21 19:23
Generated	2025-06-28 02:51
Version	Emacs 30.1 (Org mode 9.7.11)
Raw	ch1.org