By Dmitri Popov
A recent concept called the semantic web is building new meaning into ordinary HTML text. Previous issues of Linux Magazine have discussed semantic web initiatives, such as the Simile Project. One of the simplest and most mature semantic web technologies is microformats.
Microformats are simple HTML tags that reveal information about the meaning and context of web data. For instance, a microformat tag might specify that text is intended to be part of a resume or business card.
A browser designed to recognize the microformat then interprets that data accordingly. The browser might display the data to resemble a business card, or, perhaps more importantly, a client-side application could extract the contact information and then copy it to a user's address book.
Microformats are also useful in situations in which a single web client must integrate information from several sources. For instance, if you use a microformat tag to embed the latitude and longitude coordinates for a restaurant, a browser with the necessary plugins could automatically plot the restaurant location on a Google map.
Like other semantic web technologies, microformats do not require the web developer to know in advance how the client will use the data. A microformat specification merely defines a format for associating text with context.
The recent Operator plugin and an earlier plugin called Tails bring microformat support to Firefox. In this article, I will introduce you to the world of microformats and take a close look at some of the more popular microformat options, such as hCard, hCalendar, and geo. Some other common alternatives are shown in Table 1.
For a complete description of all microformat specifications, see the microformats wiki.
To explain what microformats are and how they can help you, I'll start with a simple example. Take a look at the following text:
Linux Pro Magazine
719 Massachusetts Street
Lawrence, Kansas 66044
USA
The human eye immediately recognizes this text as an address. You can see the street name, the city, and the country. You know that USA is a country and Kansas is a state. Unfortunately, computers don't (yet) have the ability to understand information just by looking at it; they need a set of markers that identify the data. Microformats can help in this situation. Unlike other similar technologies, microformats don't attempt to provide a universal solution to identifying and marking all sorts of data. Each microformat is designed for a specific situation. As I will explain later, you can use the hCard microformat to specify contact information such as this address.
If you have ever worked with HTML and CSS, you will have no trouble understanding how microformats work and how to use them. The three main building blocks of microformats are the div and span elements and the class attribute, which are all-purpose tools for adding structure to documents.
The div and span elements define divisions or sections in a document. Whereas the div element defines a section of a document similar to a paragraph, the span element is used to mark a segment directly in the text body. When used with the class attribute, the div and span elements can be used to define types of information that can't be described by HTML. For example, in an article, <div class="author">Dmitri Popov</div> may be used to indicate the author's name, and <span class="pubdate">August 17, 2007</span> may be used to indicate a publication date. Both div and span can be used to add semantic markers to a section or a text segment of a document.
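To make the idea concrete, here is a minimal Python sketch of how a client might pick up class-marked values from markup like the examples above. It uses only the standard library's html.parser module. The names ClassTextCollector and collect are hypothetical and chosen for this illustration, and the code assumes reasonably well-formed markup; it is not how any particular browser or plugin does the job.

```python
from html.parser import HTMLParser


class ClassTextCollector(HTMLParser):
    """Collect the text inside every element that carries a class attribute."""

    def __init__(self):
        super().__init__()
        self.found = {}    # class name -> list of text values
        self._stack = []   # open elements as (tag, class names, text parts)

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        self._stack.append((tag, classes, []))

    def handle_data(self, data):
        # Text belongs to every element that is currently open.
        for _, _, parts in self._stack:
            parts.append(data)

    def handle_endtag(self, tag):
        # Ignore a stray end tag that matches nothing currently open.
        if not any(open_tag == tag for open_tag, _, _ in self._stack):
            return
        # Pop until the matching element is closed (this also drops
        # unclosed elements such as <br> that sit inside it).
        while self._stack:
            open_tag, classes, parts = self._stack.pop()
            text = " ".join("".join(parts).split())
            for name in classes:
                self.found.setdefault(name, []).append(text)
            if open_tag == tag:
                break


def collect(html):
    """Return a dict that maps each class name to the texts marked with it."""
    parser = ClassTextCollector()
    parser.feed(html)
    parser.close()
    return parser.found


sample = (
    '<div class="author">Dmitri Popov</div> '
    '<span class="pubdate">August 17, 2007</span>'
)
found = collect(sample)
print(found["author"], found["pubdate"])
```

The class names themselves carry no meaning for the parser; the meaning comes from an agreement, such as a microformat specification, about what each class name stands for.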
Simpler microformats such as Rel-License and Rel-Tag use an attribute called rel. In HTML, the rel attribute describes the relationship between the current document and the resource that the href attribute points to. For the sake of simplicity, you can say that the rel attribute describes the resource that the href link points to. Knowing that, you can easily figure out that, in the case of Rel-License, the rel attribute identifies the link as pointing to a particular copyright license.
Rel-License and Rel-Tag are perfect examples of how easy it is to add microformats to your existing content. For example, suppose you have published an article under the Creative Commons Attribution-Noncommercial 3.0 license. In this case, you are most likely to provide a copyright note linking to the particular license:
<a href="http://creativecommons.org/licenses/by-nc/3.0/">Creative Commons Attribution-Noncommercial 3.0</a>
This link is good old HTML, but you can easily microformat it by simply adding the rel="license" attribute to it:
<a rel="license" href="http://creativecommons.org/licenses/by-nc/3.0/"> Creative Commons Attribution-Noncommercial 3.0</a>
As you can see, turning the link into a microformatted copyright notice is not particularly difficult. The immediate advantage is that by adding the Rel-License microformat, you make the content of a web page available for search on the basis of license type.
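A search engine or other client that wants to use this information has to find the links that carry the rel value. The following Python sketch shows one way this could be done with the standard library's html.parser module. The names RelLinkFinder and find_rel_links are hypothetical, and the sketch is not meant to describe how Yahoo!, Google, or any other service processes pages. It treats rel as a space-separated, case-insensitive list of values, as HTML defines it.

```python
from html.parser import HTMLParser


class RelLinkFinder(HTMLParser):
    """Collect the href of every <a> or <link> that carries a given rel value."""

    def __init__(self, rel_value):
        super().__init__()
        self.rel_value = rel_value.lower()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("a", "link"):
            return
        attributes = dict(attrs)
        # rel may hold several space-separated values, e.g. rel="license nofollow".
        rel_tokens = (attributes.get("rel") or "").lower().split()
        href = attributes.get("href")
        if self.rel_value in rel_tokens and href:
            self.links.append(href)


def find_rel_links(html, rel_value):
    """Return the link targets in html whose rel attribute contains rel_value."""
    finder = RelLinkFinder(rel_value)
    finder.feed(html)
    finder.close()
    return finder.links


page = """
<p>Article text.</p>
<a rel="license" href="http://creativecommons.org/licenses/by-nc/3.0/">
Creative Commons Attribution-Noncommercial 3.0</a>
<a href="http://example.com/about">About</a>
"""
print(find_rel_links(page, "license"))
```

Because the license is identified by its URL, a client can group pages by license type without having to interpret the visible link text.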
Both Yahoo! and Google are aware of the Rel-License format and allow you to search for content on the basis of its license type. For example, Yahoo! has a dedicated Creative Commons search page (Figure 1), whereas Google allows you to specify a license type in its Advanced Search section. The Rel-Tag microformat works in a similar way.
If your article describes wiki basics, for example, it makes sense to tag it with the "wiki" tag as follows:
<a href="http://en.wikipedia.org/wiki/Wiki" rel="tag">wiki</a>
This microformatted link consists of two parts. The part of the URL up to and including the last forward slash (/) is called the tag space, and the part after the last forward slash is called the tag value.
The tag space is "a place that collates or defines tags," which means that the tag space should link to a place that provides a specific meaning of the tag. In the example shown above, the link to the Wikipedia article provides the best possible (from the author's point of view) explanation of the wiki tag.
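The split between tag space and tag value can be sketched in a few lines of Python with the standard library's urllib.parse module. The function name split_tag_url is hypothetical. The sketch assumes that the tag value is the last path component of the URL, that a trailing slash and any query string or fragment are ignored, and that percent-encoded characters in the tag are decoded.

```python
from urllib.parse import urlsplit, unquote


def split_tag_url(href):
    """Split a rel="tag" URL into (tag space, tag value)."""
    parts = urlsplit(href)
    # Drop a trailing slash so that ".../wiki/Wiki/" behaves like ".../wiki/Wiki".
    path = parts.path.rstrip("/")
    space_path, _, last_component = path.rpartition("/")
    tag_space = f"{parts.scheme}://{parts.netloc}{space_path}/"
    tag_value = unquote(last_component)
    return tag_space, tag_value


tag_space, tag_value = split_tag_url("http://en.wikipedia.org/wiki/Wiki")
print(tag_space)   # http://en.wikipedia.org/wiki/
print(tag_value)   # Wiki
```

Note that in this sketch the tag value comes from the URL and not from the visible link text, so the example link yields "Wiki" even though the link text reads "wiki".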
You can think of the hCard microformat as an XHTML representation of the vCard format, a widely accepted format for exchanging contact information between applications. Although hCard is more complex than Rel-License and Rel-Tag, it is still easy enough to understand. Listing 1 shows the previously mentioned address marked up as an hCard.
All the formatting in this example should be obvious, and if you don't feel like formatting your existing contact info by hand, the handy hCard Creator can do this for you (Figure 2).
|Listing 1: Example of the hCard Microformat|
<div class="vcard">
  <div class="fn org">Linux Pro Magazine</div>
  <div class="adr">
    <div class="street-address">719 Massachusetts Street</div>
    <div>
      <span class="locality">Lawrence</span>,
      <abbr class="region" title="Kansas">KS</abbr> <span class="postal-code">66044</span>
    </div>
    <div class="country-name">USA</div>
  </div>
  <div>Phone: <span class="tel">+1-785-856-3081</span></div>
  <div>Email: <span class="email">email@example.com</span></div>
</div>
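To illustrate how an application might read such a card, the following Python sketch collects a handful of hCard properties into a dictionary. It relies only on the standard library; the class name HCardParser, the helper parse_hcard, and the shortened sample markup are assumptions made for this example. The sketch handles a single card and, for abbr elements, prefers the title attribute over the visible text. It is far from a complete hCard parser.

```python
from html.parser import HTMLParser


class HCardParser(HTMLParser):
    """Collect a few hCard property values from a single card."""

    PROPERTIES = {"fn", "org", "street-address", "locality", "region",
                  "postal-code", "country-name", "tel", "email"}

    def __init__(self):
        super().__init__()
        self.card = {}
        self._open = []   # open elements: (tag, property names, text parts, title)

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        names = [c for c in (attributes.get("class") or "").split()
                 if c in self.PROPERTIES]
        # For <abbr>, the title attribute holds the full value (KS -> Kansas).
        title = attributes.get("title") if tag == "abbr" else None
        self._open.append((tag, names, [], title))

    def handle_data(self, data):
        for _, _, parts, _ in self._open:
            parts.append(data)

    def handle_endtag(self, tag):
        if not any(entry[0] == tag for entry in self._open):
            return
        while self._open:
            open_tag, names, parts, title = self._open.pop()
            value = title or " ".join("".join(parts).split())
            for name in names:
                self.card.setdefault(name, value)   # keep the first value seen
            if open_tag == tag:
                break


def parse_hcard(html):
    parser = HCardParser()
    parser.feed(html)
    parser.close()
    return parser.card


sample_card = """
<div class="vcard">
  <div class="fn org">Linux Pro Magazine</div>
  <div class="adr">
    <div class="street-address">719 Massachusetts Street</div>
    <span class="locality">Lawrence</span>,
    <abbr class="region" title="Kansas">KS</abbr>
    <span class="postal-code">66044</span>
    <div class="country-name">USA</div>
  </div>
  <div>Phone: <span class="tel">+1-785-856-3081</span></div>
</div>
"""
card = parse_hcard(sample_card)
print(card["fn"], "|", card["locality"], "|", card["region"])
```

A dictionary like this is a convenient intermediate step: an address book application could map the keys to its own fields or write them out as a vCard file.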
The hCalendar microformat handles calendaring data. Like hCard, hCalendar uses self-explanatory formatting (see Listing 2).
A microformat-ready browser that encounters the hCalendar format can seamlessly integrate the data with other calendar and event information. The so-called abbr design pattern is used in the formatting to embed data without making it visible on the page. In this case, abbr is used to embed the start and end dates of the event.
|Listing 2: Example of the hCalendar Microformat|
<div class="vevent" id="hcalendar-LinuxTAG">
  <a class="url" href="http://www.linuxtag.org/">
    <abbr class="dtstart" title="20070530">May 30th</abbr> - <abbr class="dtend" title="20070603">June 3rd, 2007</abbr>
    <span class="summary">LinuxTAG</span>, <span class="location">Berlin</span>
  </a>
  <div class="description">Linux Expo and Conference</div>
</div>
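The abbr design pattern is what makes the dates machine-readable: the visible text can say "May 30th," while the title attribute carries a value a program can parse. The Python sketch below reads the dtstart and dtend values from markup like that in Listing 2. The class name EventDates and the helper event_dates are hypothetical, and the code assumes date-only values in the YYYYMMDD form used in the listing; other date and time notations are not handled.

```python
from datetime import datetime
from html.parser import HTMLParser


class EventDates(HTMLParser):
    """Read dtstart and dtend from <abbr> title attributes."""

    def __init__(self):
        super().__init__()
        self.dates = {}

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        title = attributes.get("title")
        if tag != "abbr" or not title:
            return
        classes = (attributes.get("class") or "").split()
        for name in ("dtstart", "dtend"):
            if name in classes:
                # Assumption: date-only value such as 20070530.
                self.dates[name] = datetime.strptime(title, "%Y%m%d").date()


def event_dates(html):
    parser = EventDates()
    parser.feed(html)
    parser.close()
    return parser.dates


sample_event = (
    '<abbr class="dtstart" title="20070530">May 30th</abbr> - '
    '<abbr class="dtend" title="20070603">June 3rd, 2007</abbr>'
)
dates = event_dates(sample_event)
print(dates["dtstart"].isoformat(), dates["dtend"].isoformat())
```

Once the values are real date objects, a calendar application can sort events, detect overlaps, or export them to another format.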
The geo microformat allows you to encode latitude and longitude data into your web content:
<span class="geo"> <span class="latitude">39.00505</span> <span class="longitude">-95.23297</span> </span>
If you add this data to a web page, the Firefox Operator plugin, which I describe later in this article, will map the location with Google Maps. Another way to use the geo microformat is to embed geographic data (geodata) directly into the web content with the abbr element. For example, if you are blogging about your recent trip to Berlin, your blog post might look something like this,
<abbr class="geo" title="52.51191;13.38519">Mohrenstrasse</abbr>
which embeds a reference to my favorite street in Berlin.
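Both notations carry the same kind of information, so a client can reduce them to a pair of numbers. The following Python sketch does this for the two forms shown above: nested latitude and longitude elements, and an abbr element whose title holds "latitude;longitude". The names GeoParser and parse_geo are hypothetical, and the sketch assumes that the markup follows exactly these two patterns.

```python
from html.parser import HTMLParser


class GeoParser(HTMLParser):
    """Extract (latitude, longitude) pairs from geo microformat markup."""

    def __init__(self):
        super().__init__()
        self.points = []       # list of (latitude, longitude) tuples
        self._current = None   # values collected for a nested geo block
        self._field = None     # "latitude" or "longitude" while inside it

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        classes = (attributes.get("class") or "").split()
        if "geo" in classes:
            title = attributes.get("title")
            if tag == "abbr" and title:
                # Compact form: title="latitude;longitude"
                lat, lon = title.split(";")
                self.points.append((float(lat), float(lon)))
            else:
                self._current = {}
        elif self._current is not None:
            for name in ("latitude", "longitude"):
                if name in classes:
                    self._field = name

    def handle_data(self, data):
        if self._field and data.strip():
            self._current[self._field] = float(data)
            self._field = None
            if len(self._current) == 2:
                self.points.append((self._current["latitude"],
                                    self._current["longitude"]))
                self._current = None


def parse_geo(html):
    parser = GeoParser()
    parser.feed(html)
    parser.close()
    return parser.points


sample_geo = """
<span class="geo">
  <span class="latitude">39.00505</span>
  <span class="longitude">-95.23297</span>
</span>
<abbr class="geo" title="52.51191;13.38519">Mohrenstrasse</abbr>
"""
print(parse_geo(sample_geo))
```

The resulting pairs are what a mapping service needs; how they are handed to a particular service is outside the scope of this sketch.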
To see what you can do with the tagged web page, you need the Operator extension for Firefox. You can download Operator from the Add-ons section of the Firefox website. Operator was developed by Michael Kaply, who describes it as "an extension for Firefox that provides interoperability between microformats and various web services." In other words, the Operator extension is the tool that actually puts microformats to some practical use by acting as a mediator between microformatted content and web-based services that can process it. For example, Operator can feed hCard-formatted data to the Google Maps service, which uses it to locate the address on the map.
When installed, Operator adds a toolbar containing different tools. Point your browser to a page containing microformats and you can use Operator to perform different actions on the microformatted content. For example, if you open a blog post containing tags, Operator will automatically detect them and activate the related tools. You can use Operator to search Flickr photos, del.icio.us bookmarks, and blogs on Technorati containing a specific tag.
Previously, I showed you how to embed geodata into a web page. Because Operator can handle the geo microformat, you can use it to display the specified geodata on Google Maps, but that is not all. For example, the Flickr photo gallery site lets you use the geo microformat to add your photos to the map (Figure 3). Click on the Map tab in Organizr and place your photos on the map. This automatically adds geocoding to the photos. Now if you view the mapped photo, Operator will allow you to map it on Google Maps with the use of embedded geocoding (Figure 4).
If you have microformatted contact information embedded into your web page or blog, users can extract it easily. To export the contact information, you can use Operator or Tails Export, which is another Firefox extension that can process microformatted content. Tails is not as flexible as Operator, but it is quite useful. When you click on the Tails icon in the browser status bar (the icon turns orange when Tails detects microformatted content on the page), you can see a nicely formatted list of all the available contacts and events (Figure 5, left column). You can then export and add them to your address book or calendar.
These examples are just a few samples of what you can do with microformats, but you don't have to stop here. Other microformats, including hReview and hResume, could also be useful for your particular needs, and additional tools can help you manage your microformatted content. For example, there is an hReview plugin for WordPress and a few user scripts that can make the Operator extension even more useful.
Finally, the book Microformats: Empowering Your Markup for Web 2.0 covers everything you would want to know about microformats and how to use them.
Semantic web: http://en.wikipedia.org/wiki/Semantic_Web
"Tomorrow's Toolbox: Semantic web tools of the Simile Project," by Oliver Frommel, Linux Pro Magazine, August 2007, pg. 56
hCard: http://microformats.org/wiki/hcard
hCalendar: http://microformats.org/wiki/hcalendar
geo: http://microformats.org/wiki/geo
Microformats wiki: http://microformats.org/wiki/Main_Page
Yahoo! Creative Commons search page: http://search.yahoo.com/cc
Rel-Tag wiki page: http://microformats.org/wiki/rel-tag
hCard Creator: http://microformats.org/code/hcard/creator
Firefox Operator add-on: https://addons.mozilla.org/en-US/firefox/addon/4106
Firefox Tails Export add-on: https://addons.mozilla.org/en-US/firefox/addon/2240
hReview plugin for WordPress: http://www.aes.id.au/?page_id=28
Operator user scripts: http://www.kaply.com/weblog/operator-user-scripts/
Allsopp, John. Microformats: Empowering Your Markup for Web 2.0. friends of ED, 2007: http://www.friendsofed.com/book.html?isbn=1590598148