Monday, March 19, 2012

HTML generation: templates, DSL and embedded scripts compared

Most web sites aren't 100% dynamic like Gmail or Facebook, even in this Web 2.0 era. There is usually one person who writes the HTML+CSS code and, if needed, a bit of client-side code (JavaScript), but who often relies on WYSIWYG tools and is not a strong programmer. Then another person, a specialized programmer and database expert, adds the server-side code where it is needed.

PHP fits this division of labour nicely by providing a tag, <?php ?>, to embed PHP code directly into HTML.

It works, but the result is an unreadable mess that is quite hard to maintain, especially because of the verbosity of HTML. To address this problem, a lot of template systems have been developed for PHP, in an attempt to separate the HTML from the PHP code. Instead of inventing a new template language and writing a parser and an interpreter/compiler for it, the most flexible template kits just use PHP itself as the template language!

The PHP template file is simply pulled in with require or include by the PHP script that implements the page's business logic. Such PHP templates can be edited by webmasters using WYSIWYG HTML editors, in a way that lets them change most presentational details without breaking any code.

E.g. in a typical PHP-based template you will find HTML code interrupted by <?php ?> sections that print the values of variables, define a loop, or choose which HTML code to print conditionally, and usually no more than that. Most WYSIWYG HTML editors display these sections as small icons, as if they were external objects embedded into the HTML. All that a webmaster has to take care of is, for instance, making sure that all PHP code sections that print values in a loop stay inside the two outer sections that define the loop itself. Any surrounding HTML code can be changed at will.

There are some disadvantages to using templates, though: compared with PHP embedded directly into HTML, they are slower and use more main memory when processed on each request. Caching can ease this drawback, but for highly dynamic resources it is not always easy or effective.

E.g. if you are formatting a big table with frequently changing data pulled from a database, you need to store all of it in memory and then pass this structure to a template engine, which sends the generated HTML page to the web server, either in one go or, if you're lucky, piecewise. Either way, you still need to keep all your data in memory, if not the HTML code too.

You can't pass one row at a time from the business logic to the presentational logic as soon as the database returns it. To do that you would need to run these two layers as separate processes or threads, but then you have to face the complicated issue of process/thread synchronization, as well as buffer the data exchanged. It is such overkill that I do not know of any "scalable" template engine that uses multi-processing or multi-threading. In simple cases, which are the majority, it would be slower than single-process template systems.

This is a scenario where templates don't scale. If you were using PHP embedded into HTML, the output could be flushed more often, virtually after each row is printed. With a slow database or an overloaded server, the user would not have to wait for the whole page to be generated server-side before starting to see something, assuming the browser supports partial table rendering and the like. And the user is not always a rendering browser: it could be a web crawler, like that of a search engine.
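The difference between the two approaches can be sketched in a few lines. The example below is in Python rather than PHP, purely as a language-neutral illustration; fetch_rows is a hypothetical stand-in for a database cursor, and the function names are made up for this sketch. It only shows where the data is held, and says nothing about real-world timings.

```python
import io
from html import escape


def fetch_rows():
    """Hypothetical stand-in for a database cursor: yields one row at a time."""
    for i in range(1, 4):
        yield (i, f"item <{i}>")


def render_buffered(rows):
    """Template-style: collect every row first, then build the page in one go."""
    data = list(rows)  # the whole result set sits in memory here
    body = "".join(
        f"<tr><td>{n}</td><td>{escape(label)}</td></tr>\n" for n, label in data
    )
    return f"<table>\n{body}</table>\n"


def render_streaming(rows, out):
    """Embedded-style: write and flush each row as soon as it is available."""
    out.write("<table>\n")
    for n, label in rows:  # only the current row is held at any time
        out.write(f"<tr><td>{n}</td><td>{escape(label)}</td></tr>\n")
        out.flush()  # on a real server this is where a chunk could be sent
    out.write("</table>\n")


buffered_html = render_buffered(fetch_rows())

stream = io.StringIO()  # stands in for the connection to the client
render_streaming(fetch_rows(), stream)
streamed_html = stream.getvalue()
```

Both functions produce the same markup. The buffered one cannot emit anything before the last row has arrived, while the streaming one never holds more than one row.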

Besides embedding PHP code into HTML, whether through templates or directly with a tag, you can also generate the HTML code from another language, that is, from a DSL which generates HTML. E.g. in Lisp you can use s-exprs, so you don't have to close tags. You close parentheses instead, but at least any editor with Lisp support matches them for you automatically. The result is less messy than HTML code. HTML has a very bad syntax, indeed! By using Lisp you get some syntax checking too, e.g. it is impossible to overlap tags this way, because Lisp closing parentheses are all the same, unlike HTML closing tags. By intermixing Lisp forms (that is, commands) you can add any dynamic behaviour you want. Lisp provides you with full support for functional programming, that is, you can code functions devoid of side effects, which are easy to test and help to implement a modular approach to HTML generation.
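To give a rough idea of the DSL approach without assuming any Lisp, here is a minimal sketch in Python that uses nested tuples in place of s-exprs. The (tag, attrs, children...) layout and the render and price_list names are assumptions made for this illustration, not the API of any existing library. Since every element is one tuple, overlapping tags simply cannot be written, and a page fragment is an ordinary side-effect-free function.

```python
from html import escape


def render(node):
    """Render a nested (tag, attrs, *children) tuple, or a plain string, to HTML."""
    if isinstance(node, str):
        return escape(node)  # text nodes are escaped
    tag, attrs, *children = node
    attr_text = "".join(f' {key}="{escape(value)}"' for key, value in attrs.items())
    inner = "".join(render(child) for child in children)
    # The closing tag is derived from the opening one, so they always match.
    return f"<{tag}{attr_text}>{inner}</{tag}>"


def price_list(items):
    """A pure function that builds one fragment of the page from its arguments."""
    return ("ul", {"class": "prices"},
            *[("li", {}, f"{name}: {price}") for name, price in items])


page = ("div", {},
        ("h1", {}, "Prices"),
        price_list([("tea", 3), ("milk & honey", 5)]))

html_text = render(page)
```

Because price_list depends only on its arguments, it can be tested on its own, which is the modularity benefit mentioned above.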

If you define your Lisp macro so that the generated HTML is returned as strings that are then concatenated, it will use more memory than PHP templates with <?php ?> tags. If you turn them into (write-string) forms, like CL-WHO does, you save memory, but it is slower than both hand-made (format) code that prints HTML tags directly from Lisp and, of course, static HTML. Whichever method you use, you pay a performance price for printing HTML from your code with a nicer syntax: the cost of converting from the DSL to HTML. Moreover, you have to convince the non-programmer webmasters to learn the Lisp syntax and abandon all their WYSIWYG HTML editors. Quite tough, if not impossible, to do on a large scale!
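The two strategies, concatenating returned strings versus writing pieces straight to the output, can be contrasted with a small sketch. It is written in Python and works at run time, so it is not how CL-WHO or any Lisp macro is actually implemented; to_string and to_stream are names invented for this illustration. It shows only the structural difference and makes no claim about relative speed.

```python
import io
from html import escape


def to_string(node):
    """Concatenating strategy: every subtree is built as an intermediate string."""
    if isinstance(node, str):
        return escape(node)
    tag, *children = node
    return f"<{tag}>" + "".join(to_string(child) for child in children) + f"</{tag}>"


def to_stream(node, out):
    """Writing strategy: pieces go straight to the output and nothing is returned."""
    if isinstance(node, str):
        out.write(escape(node))
        return
    tag, *children = node
    out.write(f"<{tag}>")
    for child in children:
        to_stream(child, out)
    out.write(f"</{tag}>")


tree = ("table", ("tr", ("td", "1"), ("td", "a < b")))

out = io.StringIO()  # stands in for the response stream
to_stream(tree, out)
```

The first function keeps the markup of a whole subtree alive until its parent has been assembled. The second only ever holds the small piece it is about to write.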

What is more, this DSL-based model works only if the front-end and the back-end developer are the same person, with a single skill set, and even then it isn't always feasible! E.g. you may need to add a bit of dynamic behaviour to some existing static HTML code. Suppose you don't have time to convert the whole document to your DSL of choice: they do not pay you for that, and it would not be worthwhile anyway, since it is a mostly static page. This is why a <?php ?> tag is not always a bad idea. Even if you have an automatic HTML-to-DSL converter, it is still slower to generate the bulk of the HTML, which is static, through the DSL at each request.

Ideally, a server-side language should offer both ways of generating dynamic HTML, so that you can choose between them case by case. If the page is highly dynamic, go for the DSL; otherwise use templates, provided they give you roughly the same performance as directly embedded PHP.

The problem with PHP is that it does not make it easy or efficient to develop flexible DSLs for HTML generation. This is where Lisp, with its powerful macros, can provide added value. But AFAIK Lisp-based server solutions do not provide anything like the PHP tag <?php ?> for quick & dirty dynamic additions or PHP-style template implementations, apart from mod_ecl, a small project which has been abandoned.
