Friday, December 2, 2011

A taste of functional programming for initiates

What's functional programming? Is it just a passing hype or there is something good with it? I want to show you how easy and fun is to decompose problems using mathematical functions as much as you can, by a very simple example.

Let's consider the problem of demonstrating that the sentence "The quick brown fox jumps over the lazy dog" actually contains all the 26 letters of the English alphabet. First, solve this problem in your favourite imperative or object-oriented programming language and then compare your solution with the following one, written in standard Common Lisp:

(sort (remove-duplicates (string-downcase (remove #\space "The quick brown fox jumps over the lazy dog"))) #'string<)

This program may seem weird and unreadable, especially if you are new to Lisp, but if you put all your imperative code that does the same thing on one line I bet it gets even worse. Of course, you can always make use of some proper indentation to make your programs more readable:

(sort
  (remove-duplicates
    (string-downcase
      (remove #\space "The quick brown fox jumps over the lazy dog")
    )
  ) #'string<)

A good text editor will help you to properly indent your program and balance parenthesis. Lisp has been unfairly criticized because abounds in parenthesis, by people who don't even know it well. If you learn Lisp, you will soon realize that brackets are there for keeping the syntax simple and flexible.

After all, imperative languages use a lot of different parenthesis as well! For instance think of braces (curly brackets) to define code blocks, round brackets to enclose conditions, square brackets to index arrays etc. Not to mention semicolons, used as statement terminators - something that is unnecessary in Lisp. Isn't that worse? A lot of syntax with no added benefit, considering these languages are neither homoiconic, nor full-featured nor fully extendable as Lisp is.

It doesn't seem Java or C++ code looks prettier than Lisp code, but even if it did, who really cares? We only care about power and expressiveness in a programming language, don't we? Simple and powerful is better than elaborate and limited! Lisp has a syntax so simple you can learn it in a minute yet you can do everything that can be done in all the other syntax-involved languages and even more.

The output of our whole program is:

"abcdefghijklmnopqrstuvwxyz"

Wonder why output string are quoted? Lisp output can be used as input, but facilities are offered to output in any other format as well.

If you install a Common Lisp implementation you can try out each sub-expression at the REPL prompt. REPL stands for read-eval-print-loop and it's a Lisp interactive interpreter you can play with. I suggest you use CLISP which is open-source and available for all main platforms. Start with the innermost form:

$ clisp
...
[1]> (remove #\space "The quick brown fox jumps over the lazy dog")
"Thequickbrownfoxjumpsoverthelazydog"

Here we are just stripping blanks off a string. If we hadn't remove we could construct one this way, at least in this case:

(remove-if (lambda (x) (eql x #\space))
  "The quick brown fox jumps over the lazy dog"
)

This may seem convoluted, but it is easily explained: we define an anonymous function to check whether something is a blank character. In Lisp a function without a name can be created on the fly and passed to other functions as an argument or returned as a value from another function. remove-if is a generic function that accepts a function as its first parameter. This function is used as a predicate, that is its return value is interpreted as either true or false. remove-if removes all elements that satisfy this predicate from its second argument and returns the result as a new sequence. No side-effects here! It's generic, you can code any predicate you want.

This is what the OO buffs - or should I say goons - call the "command pattern". It's just a clumsy way to implement the same thing being limited by an inferior programming language. Indeed, not only clumsy, but less powerful too! This is because in Lisp you can nest functions and lambda functions can have unbounded variables, that is access variables in the enclosing scopes apart from its own parameters. These variables are automatically garbage-collected. You just have to think of them as free variables like in mathematics. Why shouldn't we program as we think? Who cares how the computer works? It's better to abstract from that.

The sort function is generic too: its first argument is any sequence (list, string, array) and the second is a comparison function of two arguments, a predicate. Before we used the predefined function string<, which is the specific Lisp operator for comparing strings alphabetically. There is no difference between predefined operators and your own functions in Lisp. Even the < operator for numbers is nothing else that a function we could pass to sort as #'<. #' is just a way to tell Lisp that we want to refer to a function by name and Lisp should not try to evaluate this symbol. For instance we can use sort to sort characters within a string:

[1]> (sort "GCT" #'char<)
"CGT"
or sort a vector (array) of strings by character count, that is sort the elements in order of count of characters:
[2]> (sort '#("Now" "is" "the" "time" "for" "all" "good" "men" "to" "come" "to" "the" "aid" "of" "their" "country.") (lambda (a b) (< (length a) (length b))))
#("is" "to" "to" "of" "Now" "the" "for" "all" "men" "the" "aid" "time" "good" "come"
  "their" "country.")
If you want a stable sorting, just use stable-sort instead of sort. We could even dare more and define remove using remove-if:
(defun my-remove (item seq)
  (remove-if
    (lambda (x) (equal x item)) seq))
This is indeed a simplified version of remove, but illustrates how we can define a lot of specialized and yet reusable tools. Programming with first-class functions is awesome! See the power of having first-class functions? Languages that limit how functions can be used are not good! Since we call pangrams such phrases that contain all of the letters of the alphabet, it makes sense to make our code reusable by defining a predicate to check for "pangramness":
(defun pangramp (string)
  (string=
    (sort
      (remove-duplicates
        (string-downcase
          (remove #\space string)))
      #'string<)
    "abcdefghijklmnopqrstuvwxyz"))
Here's how to use it:
[2]> (pangramp "The quick brown fox jumps over the lazy dog")
T
[3]> (pangramp "Not a pangram")
NIL

You may want to make this function more sophisticated, e.g. stripping not only spaces but any kind of blank character like tabs, newline and even punctuation characters (including dashes), supporting many languages with different alphabets and providing a switch to check for perfect pangrams. A pangram is said to be "perfect" if no duplicate letters occur, that is it has the same length of the alphabet. Btw, you can find a good list of pangrams in several languages on wikipedia.

If you want, Common Lisp supports imperative programming as well: you have even more powerful way to keep and change state (aka variables) than in most imperative functions, but these are features you have to use carefully, only when needed. E.g. Lisp assignment operator setf is more powerful than Java's = in that it supports generic setters. You can set a value inside a data structure as complex as you want by using the same code you use to get that value, whilst in Java you need to duplicate code to implement getters and setters.

And on the OO side, LISP is even more powerful than any other OO language, but wisely encourages you to use the OO paradigm only when appropriate.

Moreover, as we saw, Lisp has full support for generic programming. E.g. remove-if works on strings, lists and arrays, all the ways to sequencing data that are available in Lisp. Same function, same simple syntax. And you can easily write your own generic code too. Unlike minor languages, everything that the standard Common Lisp library does can be done by your code too.

Lisp is both a script and a system language. Very few scripting languages take advantage of the concept of lists like Lisp does. Lists are really Lisp's workhorse and there are a lot of powerful facilities to process them, especially in a functional, state-less way.

I've just touched the surface of Lisp here, but I hope I whet your appetite. The truth they concealed is Lisp is a powerful, standardized, expressive and efficient industrial-strength language. It has all the features you can find combining all the other less-powerful programming languages hyped today, yet in one simple language.

Once you get used to functional programming, you will find out that it is a more straightforward and natural way to code algorithms in. You get out better properties from your code, in terms of correctness, reusability, clarity without introducing unnecessary structure as it is often the case with OOP.

Learn Lisp and functional programming and use it everyday! You will be a better, more productive programmer and will like your job more. It will also give you an edge over competitors using less powerful languages. Two readings I recommend are:

No comments: