The Lisp book introduces two languages with the expression formats - the language itself, comprised of symbolic expressions, and a meta language employing meta expressions. My plan is to implement both languages and bootstrap the whole thing from the definitions in the meta language on page 13.

I stay with the book structure, which follows the definition of Atoms with one for Symbolic Expressions, or S-Exps or sexps - the famous Lisp grammar with all the parenthesis. However, first I need to finish some tests for the atoms - I committed (and cpublished the previous article ;-)), but the book had more examples and one of the idea is to have all the examples from the book be used as test input. So, quickly add some testcases:

parsing("APPLE") should equal(Atom("APPLE"))
parsing("EXTRALONGSTRINGOFLETTERS") should equal(Atom("EXTRALONGSTRINGOFLETTERS"))
parsing("A4B66XYZ2") should equal(Atom("A4B66XYZ2"))

Everything green, so work on symbolic expressions can start. First, make sure that atoms can be parsed.

// "All S-expressions are built out of atomic symbols and the punctuation marks "(", ")", and ".".
they should "parse an sexp" in {
  implicit val parserToTest = sexp
          
  parsing("ATOM") should equal(Atom("ATOM"))
}

(To make it easier to connect book and test code, I am quoting from the book before I write a test. Yes, I’m tempted to go all-out literal programming here, but typing in the whole thing would slow me down too much :-))

The test runs green with a very simple parser:

def sexp: Parser[BaseSexp] = atom

and replacing the AST with:

sealed abstract class BaseSexp
case class Atom(name: String) extends BaseSexp

On to the meat - we need to parse a sexp. The simplest test (again, from the book), looks like:

parsing("(A . B)") should equal(Sexp(Atom("A"), Atom("B")))

and in order to satisfy it, the parser is extended:

def sexp: Parser[BaseSexp] = "(" ~> sexp ~ "." ~ sexp <~ ")" ^^ { case x~dot~y => Sexp(x, y) } | atom

This compiles when we create a case class to hold S-expressions, giving:

sealed abstract class BaseSexp
case class Atom(name: String) extends BaseSexp
case class Sexp(car: BaseSexp, cdr: BaseSexp) extends BaseSexp

The recursive parser definition is powerful enough to parse the other book examples, so just adding them completes the basic S-exp parser:

parsing("(A . (B . C))") should equal(Sexp(Atom("A"), Sexp(Atom("B"), Atom("C"))))
parsing("((A1 . A2) . B)") should equal(Sexp(Sexp(Atom("A1"), Atom("A2")), Atom("B")))
parsing("((U . V) . (X . Y))") should equal(Sexp(Sexp(Atom("U"), Atom("V")), Sexp(Atom("X"), Atom("Y"))))
parsing("((U . V) . (X . (Y . Z)))") should 
    equal(Sexp(
            Sexp(Atom("U"), Atom("V")), 
            Sexp(Atom("X"), Sexp(Atom("Y"), Atom("Z"))))) 

It’s fun to look back at the full parser definition at this point:

trait SymbolicExpressionParsers extends RegexParsers {

  def sexp: Parser[BaseSexp] = "(" ~> sexp ~ "." ~ sexp <~ ")" ^^ { case x~dot~y => Sexp(x, y) } | atom
  
  def atom: Parser[Atom]  = regex("[A-Z][A-Z0-9]*"r) ^^ { new Atom(_) }
}

Just two lines of code implement a working Lisp parser!

The next installment will start with the meta language by implementing the basic function for constructing lists.

this article discusses this version of the code