Tuesday, December 13, 2011

Starting out with Lisp

Tools and tutorials


Windows


You can still run CLISP if you want, although you better change your OS for a real one :)

Linux


www.clisp.org

Your distro may have CLISP already packaged. After installation, enter the interpreter with:

$ clisp

Use (load "/path/to/file.lisp") to load an external Lisp program or you can type your code directly at the prompt.

Mac OS X


Download Allegro CL Express Edition for Mac OS X. Install it with:
$ sudo cp -R /Volumes/AllegroCL\
/AllegroCL/Applications/
$ sudo /Applications\
/AllegroCL/newlicense
Load the interpreter/compiler with:
$ /Applications/AllegroCL/alisp
To avoid too much typing you may want to add this directory to your PATH or make an alias in your ~/.bash_profile.

Tutorials


If you already know C, you should read this very interesting article comparing Lisp and C.

A good resource to start learning Common Lisp is this course materials from Simon Fraser University. Begin with Tutorial 1.

If you'd rather take it easy, then www.lisperati.com is for you!

My intro: Tower of Hanoi

My first simple Lisp (LISt Processing language) program solves the Tower of Hanoi game:

(defun hanoi (n source dest by)
  "Recursive solution of Tower of Hanoi"
  (labels (
      (move (d q source dest by)
        (if (= q 1)
          (format t "~&move disc #~D from ~A to ~A~%" d source dest)
          (progn
            (move (+ d 1) (- q 1) source by dest)
            (move d 1 source dest by)
            (move (+ d 1) (- q 1) by dest source)
          )
        )
      )
    )
    (move 1 n source dest by)
  )
)

Lisp syntax is very simple but must be understood since it is different from many other non-functional languages. Although at first sight it may seem strange and limited, so abounding in parenthesis, it is actually simpler, neater and more powerful than any other syntax-involved language.

Everything in Lisp is an expression and returns a value, there is no artificial distinction between expressions and statements, because it is better to make no difference in this respect. An expression can be an atom (a number, variable name, string, character, etc, all evaluating to themselves) or a list, enclosed by round brackets and interpreted as a command. Unless Lisp is told that a list is just data, it evaluates it as the result of a function call. While in C you write:

func(arg1, arg2)

in Lisp this is

(func arg1 arg2)

There are no commas and the function name slides into the parenthesis, but this is only a superficial way to look at it. The real power of this notation comes from the fact that code just looks like data and this enables powerful macro facilities in Lisp, much more powerful than #define macros in C!

Even a function definition in Lisp is a list: a defun command or special form with a side-effect of defining a new function, with the following syntax:

(defun function-name (arguments) function-body)

where function-body is a sequence of expressions making up the function definition. Only the value of the last expression will be returned as the function value. There is no special syntax for operators, they are just functions, as in mathematics. And thanks to delimiting parenthesis and Lisp prefix notation, where the operator comes first and then all its operands, they can easily accept more than two arguments. E.g. 1<a<3 can be translated directly into (< 1 a 3), while in other languages (e.g. Java) you have to write (1<a && a<3). You can have any number of operands for < and similar operators to build a chain of inequalities.

In the program above the hanoi function's body is an (optional) documentation string followed by a local function definition. The syntax for defining some local functions at the current scope is:

(labels ((fun1 (arg1) body1) ... (funn (argn) bodyn)) body)

fun1 ... funn are functions visible only to "body", which is a sequence of expressions just like function-body. In this case we only need one local function which we named "move" and the labels' body is a single command to invoke this function:

(move 1 n source dest by)

The first parameter to move (d) is the recursion level which corresponds with the disc number every time we are ready to move one disc, provided we agree in numbering discs in ascending order from the bigger to the smaller. You can see from the above call to "move" that this recursion level is initialized to 1. The second argument of move (q) is the quantity of discs to move and is initialized to n. This n is (the value of) a parameter of hanoi, which could also be accessed inside move if we wanted but we do not need that.

By the way notice that Lisp is dynamically typed, so we don't need to declare any variable type. This has pros and cons, but it combines nicely with other Lisp features that enable generic programming with no frills.

Towers of Hanoi is a problem that is best solved using recursion and functional programming. Even if you are using a non-functional programming language, it would be only cumbersome to code the solution in an iterative way, since the use of an explicit stack would be needed anyway.

Once you get used to it, thinking recursively is not difficult. I claim that recursion is actually easier than iteration and should be taught to pupils since primary school. Sure enough, you can read out the following recursive code:

(move (+ d 1) (- q 1) source by dest)
(move d 1 source dest by)
(move (+ d 1) (- q 1) by dest source)

simply as: move q-1 discs from the source rod to the by one, using the dest rod as a foothold. How it does that is not a problem, since it can use the same algorithm, although we still have to finalize its definition. This is the only tricky part of recursion! Once you do that you are left with only one disc in the source rod you can safely move to the dest rod (no "by" rod needs to be used this time, but we must still pass "by" to move in order for the call to be consistent). Now we have q-1 discs on the by rod, only 1 on the dest rod, which is in the right place, thus we just need to move the former to the dest rod using the source rod for support and we're done.

Indeed, this definition makes sense if we also say how to solve the q=1 problem in a non-recursive way. The solution for q=1 is trivial and it's the purpose of that "if" form to print it out.

For recursion definitions to make sense and lead to a program that terminates, they cannot go on forever. They must end on a banal case, and indeed this is the power of recursion: it reduces a bigger problem to a smaller and smaller one until its solution becomes apparent. All the pieces can then be put back on together to build the solution to the whole problem.

And the way we program it is by just stating what has to be done so that thinking recursively is actually simpler than thinking iteratively. Why? Because iterations are based on state. You have to follow up how variables are changed in order to understand an iterative algorithm. It is not a natural way to describe a computation and often you will need pencil and paper to fully comprehend a program you are reading. As programs get bigger and bigger, state information grows as well. Even if using the object-oriented paradigm, state usually stalks big parts of code, bigger than you can manage. Objects interact among themselves in complicated ways: an object sends a message to another object changing the state of the latter and sometimes of both. Making the execution thread of a program gives you a headache. This will never happen with functional program. You can digest a program piece by piece, continuing tomorrow without the need to jot down where execution left.

In Lisp you can use the object-oriented paradigm but it is discouraged to use it as the only way to structure your programs. Lisp instead supports many different programming paradigms, emphasizing functional program in particular. You build the bulk of your program as a bunch of functions who behave like mathematical ones: they contain no side-effects, viz they neither output or input anything or change variables so that the value they return could depend on time. A return value from a function should only depend on the actual arguments, just as in mathematics. Only a minority of functions or objects in your program should have side-effects, where absolutely required.

Programming this way is initially a bit more difficult, until you get used, but leads to programs that have better properties. Generally program execution is just a process performed by a machine and introducing a dependency on time in that process makes it harder to follow, understand and debug by humans. Iterative programming is a big mistake! It may seem apparently more straightforward, but it is not and you pay a high price for it.

Back to our sample program now! I still have to tell you how format, a printf cousin, works. The first argument of format is the destination. T stands for *STANDARD-OUTPUT*. Second argument is the format control string. A ~ character is used to introduce formatting directives. The ~D directive formats an integer as decimal digits with a leading minus sign if the number is negative. ~% is interpreted as a newline, while ~& ensures output starts on a fresh line. Non-numeric Lisp data is printed using the ~A directive. Follows a list of zero or more arguments to be used by the control string to produce formatted output.

We are now ready to give it a try:

CL-USER(2): (hanoi 1 "A" "C" "B")
move disc #1 from A to C
NIL
CL-USER(3): (hanoi 2 "A" "C" "B")
move disc #2 from A to B
move disc #1 from A to C
move disc #2 from B to C
NIL
CL-USER(4): (hanoi 3 "A" "C" "B")
move disc #3 from A to C
move disc #2 from A to B
move disc #3 from C to B
move disc #1 from A to C
move disc #3 from B to A
move disc #2 from B to C
move disc #3 from A to C
NIL

If you are wondering where that NIL comes from, it is just the return value from labels that the interpreter is printing out.

A different implementation using a sort of switch loop (cond in Lisp) is here. In what is this different? Mainly because it changes the recursion base case: n, the number of discs to move is allowed to reach 0, in which case the function dohanoi does nothing. Btw dohanoi is no more a local function and since cond is used, a progn block form is not needed as with if.

Maybe this is not a great use of cond, since there is only one condition to test. Using a base case of 0-discs for the recursion, I would prefer to rewrite the program as follows:

(defun hanoi (n source dest by)
  "Recursive solution of Tower of Hanoi"
  (labels (
      (move (d q source dest by)
        (unless (<= q 0)
          (move (+ d 1) (- q 1) source by dest)
          (format t "~&move disc #~D from ~A to ~A~%" d source dest)
          (move (+ d 1) (- q 1) by dest source)
        )
      )
    )
    (move 1 n source dest by)
  )
)
This makes the move function a bit more robust: it just does nothing if by mistake q (that is n in hanoi) is less than or equal to 0. progn is now unnecessary because, unlike "if", unless accepts a sequence of expressions as its body, having no "else" case. The overall code is a bit more compact, but don't be fooled by the reduced number of recursive calls to move (2 instead of 3): this version does make more recursive calls, e.g. for a problem of dimension 4, only 31 calls instead of 22, but asymptotic complexity is the same.

No comments: