LJ Archive

YODL or Yet Oneother Document Language

Karel Kubat

Issue #43, November 1997

Would you like to publish the texts you write in more than one format—PostScript for printing, plain ASCII for e-mail, and HTML for the a web page? YODL could be just the language you need.

I wrote the first version of YODL in late 1995 and early 1996, as a general document language for my personal use then named DOM, for “document maintainer”. I was dissatisfied with the SGML converter I was using at the time, and wanted to write a quick-and-dirty program that would read a document and convert it either to LaTeX (for printing PostScript) or to HTML.

Unfortunately, a quick-and-dirty program, like a boomerang, has a way of flying back at you. As soon as I began to use YODL, I told my colleagues at work about it, and they were immediately ready to use it too. Soon I was rewriting YODL, implementing new features, writing converters for other formats and documenting it all. YODL has evolved from a private program to the document language everyone in my department uses. I consider YODL ready for the world in its current “beta” state. In this article I'd like to introduce YODL and tell you why it is a handy program to have around.

Obtaining and Installing YODL

Design Specifications of YODL

First, let me explain my goals in designing YODL. Most importantly, I wanted YODL to become a document language that would be intuitive and easy to use, unlike many common document languages such as LaTeX, HTML and some current SGML implementations.

In each of these languages, certain characters are special. In LaTeX, for example, you can't simply enter $%\* and expect it to appear in your text. In HTML you must enter < as < to produce the < character. In my version of SGML you must enter an ampersand (&) as either & or as &ero;, depending on whether you want the ampersand to appear in normal text or in a listing—which is horrible. On the other hand, all characters in YODL appear in the output as you entered. My idea is this: just type away, and whatever you type goes into the output. YODL implements translations such as these with character conversion tables. For example, the LaTeX conversion table specifies that a * in the input results in a $*$ in the output. (You could even create a character conversion table stating that aa should lead to bb, bb to cc, and so on; which might produce interesting output.)

“Whatever you type goes to the output” is, of course, relative. Each document language must provide a way to put commands in the text to do things such as change the font, start a new chapter, etc. Typing commands in LaTeX, HTML or SGML is, in my opinion, awkward. For example, typesetting text in boldface requires that you enter {\bf text}, <strong>text</strong> or <bf>text</bf>. I don't know about you, but my fingers always get stuck when typing the curly braces, backslashes and smaller-than or greater-than characters. If you program on a regular basis, you will probably agree that typing parentheses is easier. Therefore, I chose to define a command in the YODL language as a macro name, followed by arguments in parentheses, as in bf(ext). This syntax looks like a C-style function with an argument list, except that macros having more than one argument will have each argument within separate parentheses. Another advantage of parentheses is that many editors have a “match-mode” that highlights pairs of these characters making the typing of text much easier.

As for the “usefulness” of a document language, however arbitrarily measured, I find that such a language must support at least automatic numbering of sections, labels in the text and references to them, and it must create a table of contents. Respect where respect is due: these features (and lots more) are implemented in LaTeX. I, therefore, chose to let YODL “emulate” this feature in other output formats. When YODL converts a document to HTML, it creates a clickable table of contents and numbers its sections.

By design, the YODL package consists of one generic program that is able to process simple commands—this is the yodl program. This program is not a real converter by a long shot but just a first phase. The bare yodl program knows about commands such as INCLUDEFILE (to read in a file), DEFINEMACRO (to define a new macro), IFDEF (for conditional execution), etc. A real converter uses yodl for its first phase but supplies macro files and character conversion tables for a given format. For example, yodl2tex which converts a YODL document to LaTeX, loads an appropriate macro file specifying that bf(text) is in LaTeX {\bf text} and that a $ character must be typeset as $\dollar$. Some converters, like the HTML converter, require a post-processor for specific actions, e.g., to resolve labels and references and to create a clickable table of contents. Normally, the user is not aware of such peculiarities: simple shell scripts (such as, yodl2html, yodl2tex, yodl2txt) run the yodl program, supply the right macro files and, if necessary, start post processors.

The last design consideration I want to mention is that situations can arise in which you must use commands in a given output language (LaTeX, HTML or whatever) to accomplish special goals. Although YODL can be used without knowledge of the output format, the final document language is by no means hidden. The macro package implements the commands latexcommand(cmd), htmlcommand(cmd) etc., which are hidden. The macro package implements only the commands active for their output format. This means that all the nifty features of YODL can be used for the “standard” things, while you can always fall back on the specific commands of the final output language for special features.

Using YODL

The macro package that comes with YODL defines four main types of documents: articles, reports, books and man pages. A YODL source file must always mention what type of document is being processed. The difference between the document types is that different top-sectioning commands are defined for each (e.g., an article has no chapters, a report has no parts and a man page has its own specific sections) and how the HTML converter splits the resulting HTML file into different sub-files. The HTML output is split by chapters, which are reached via a clickable table of contents and by “next chapter” links. The statement that defines the document type also sets the document title, the author and the date. (If you're familiar with LaTeX then all this probably rings a huge bell. Yes, I'm guilty of “borrowing” wherever I can.)

Before stating the document type, several optional commands can be specified to alter the appearance of the document. Examples are mailto, a macro that sets the default e-mail address for HTML output, or htmlbodyopt, a macro that is used to define the foreground or background colors. A sample document could therefore be started as:

htmlbodyopt(fgcolor)(#0000B7)
mailto(karel@icce.rug.nl)
report(XWatch: Watcher of logfiles
        under an X session)
        (Karel Kubat)
        (1996)

These statements would be followed by the document. Reports are divided into chapters, chapters into sections, sections into subsections, etc. When labels are used following the sectioning commands, references can be made to those labels. In HTML the references are clickable links:

chapter(Introduction)
This is the introductory chapter. Specific
information about the installation is in
chapter ref(install).
chapter(Installation) label(install)
This chapter describes the installation in
detail.
As an example of user-defined macros, consider the following. Let's say that you want to typeset 1/4 either as shown here, or as a fraction with the 1 above the 4. The layout of this fraction will depend on the output format, with only LaTeX supporting the latter notation. The way to accomplish this would be to define a new macro, say quart:
DEFINEMACRO(quart)(0)(\
  latexcommand(\frac{1}{4})\
  txtcommand(1/4)\
  htmlcommand(1/4)\
  mancommand(1/4))
This statement defines a macro called quart having zero arguments. Macros can have up to thirty-five arguments. The macro expands to a command in each output format, LaTeX, HTML, man or plain ASCII—one per conversion. Now, the new command can be used as:
a quart is quart() gallon
Also, note that the backslash character can be used to split long lines into readable source code, as in shell script programming. It's just one more feature that I considered handy and, therefore, implemented.

The YODL language has tons of other useful macros—there's not space for a complete enumeration. However, here's a couple just for the fun of it. You can use footnotes in a document by specifying:

footnote(

Depending on the output format, the text is either a real footnote or it is placed in the running text. Similarly, the macro:

url(
puts a clickable URL in a HTML document or mentions the description and location in other types of output. And there's lots more, including the way YODL handles accent characters. All can be found by reading the documentation that comes with the YODL archive.

YODL's Macro Files

What YODL Still Needs

As I said before, I now consider YODL to be in a beta state, and I've released it to whoever wants it. One of the great things about the Linux community is that it is a pool of millions of beta-testers. I expect YODL to continue to evolve over the next few months.

In any case, what YODL still needs is a flawless converter to plain ASCII via the groff route. I wrote two converters: one that interfaces to the man format and one that interfaces to the ms format. That's at least a good start; though the man and ms packages can do more than just what I've picked out. Furthermore, the HTML converter only supports “standard” HTML, no nifty features such as frames, although there is some rudimentary table support. And, of course, converters to other formats would be welcome, like a converter to the texinfo format for the creation of info files, etc.

All in all, YODL has already proven itself in our department, where almost all documents that accompany our software are now written in the YODL language. I hope you won't find YODL worthwhile looking at, or I'll be swamped emptying my mailbox for the next few months and improving the YODL package. All kidding aside, I hope this article has tickled your curiosity enough, that you'll try out YODL and discover its usefulness.

Karel Kubat lives in the northern part of the Netherlands with his family, cats and 1967 Volvo Amazon. He's a full-time Linux fanatic, who can be reached via e-mail at karel@icce.rug.nl.

LJ Archive