Generating graphics, charts and graphs for your web site is easy following Mr. Lerner's instructions.
Mark Andreessen, a co-founder of Netscape, is often credited with having turned the Web from an academic playground into a mass medium. But just what did Andreessen do? After all, Tim Berners-Lee invented the browser, HTML and URLs. You could even argue that the original browser was superior in some ways, in that it allowed people to write pages of HTML as well as read them.
Historians might take issue with this, but I would argue that Andreessen's greatest idea was allowing for graphics alongside text in web documents. As a text-only medium, the Web was interesting mainly to physicists and other academics, but with the introduction of graphics, it began to appeal to an entirely new segment of the population.
Today, graphics are not just used for decoration, but often stand on their own. Nearly every professional web site now hires one or more graphic artists to design the site—even when the site will deal mainly with text. Some sites would not be possible or even worthwhile were it not for the use of graphics. In some cases, these graphics are dynamically generated, produced by a program, instead of sitting in a static file on disk.
This month, we will look at ways in which we can create such dynamic graphics with CGI programs. We will look at the GD library, which allows us to create arbitrary images, and will quickly move on to creating different kinds of dynamically generated charts and graphs. After looking at some simple examples of such charts, we will examine a more sophisticated example, one which draws its inputs from a relational database.
Writing a CGI program that outputs HTML is not particularly difficult, as we have demonstrated in many previous installments of “At the Forge”. Here, for example, is a simple program that, when invoked, returns some HTML to the user's browser:
#!/usr/bin/perl -wT use strict; use diagnostics; use CGI; use CGI::Carp qw(fatalsToBrowser); # Create an instance of CGI my $query = new CGI; # Send an appropriate MIME header print $query->header(-type => "text/html"); # Send some content print $query->start_html(-title => "This is a test."); print "<H1>Testing!</H1>\n"; print "<P>This is a test.</P>\n"; print $query->end_html;
If we want to return graphics to the user's browser, we must modify the “Content-type” header in the HTTP response, generated with the call to “header”. If we want to generate a GIF, we will have to change our call to header such that it outputs “image/gif” instead. By the same token, we can tell the user's browser that a JPEG (image/jpeg) or PNG (image/png) graphic will be sent. Once we have described the content to the user's browser, we must generate a graphic of this type. How can we do that?
Perl's scalar variables can contain any data we might like. If we were more familiar with the GIF standard, we could stick a GIF into a scalar, then send that value to the user's browser. Of course, most of us are unfamiliar with the intimate details of the GIF standard, which makes this a less than ideal solution. A better idea would be to take advantage of Perl's object-oriented capabilities, using someone else's solution to the same problem.
Sure enough, Lincoln Stein (author of CGI.pm, the standard module for CGI programs) has written and distributed GD.pm. This module, available on CPAN (see “Resources”), gives us access to the popular “gd” libraries for C written by Thomas Boutell.
GD gives your program the ability to draw in a manner similar to popular drawing programs. You can choose from an array of paint brushes, colors and built-in shapes, as well as any fill shapes you have drawn. GD has its own internal drawing format, but as we will see, it supports the conversion of drawn images into GIF format.
A simple program that uses GD, gd-intro.pl, is shown in Listing 1. If you install it in your CGI directory and invoke it from your browser, you should see a blue-filled green square.
As you can see, our program manipulates two objects—an instance of CGI and an instance of GD. Each object handles its own affairs, keeping its nose out of the other object's business. $query, our instance of CGI, neither knows nor cares what sort of data we are receiving from the user or returning to his or her browser. By the same token, $image, our instance of GD, does not know that its output is going to be sent to a browser. Such compartmentalizing of tasks is one reason why objects make programming easier and software more maintainable.
When we create $image, we declare it to be of type GD::Image and to be 100 pixels wide by 100 tall. GD will not warn you if your image is cut off by the boundaries of this declared “canvas”; when I first started to play with GD, I was puzzled by the fact that no output appeared. I finally realized that my image was 100x100, but I was drawing a circle with a 400-pixel diameter. GD dutifully performed the task I requested, which meant that no picture appeared in the end.
After declaring $image, we allocate some colors, using GD's colorAllocate method. Each color is defined as red-green-blue (RGB), with each of those parameters varying between 0 and 255. I find it useful to declare color names within a hash, as with %COLORS in gd-intro.pl, but you may prefer to assign them to individual variables or to work with colorAllocate directly.
Next, we tell $image that it should create GIFs in “interlaced” mode. Interlacing means that instead of drawing every horizontal line of an image, the computer should first draw all the even lines, and then all the odd lines. You can see this in action on an ordinary television set. When TV standards were defined, televisions were unable to draw all the horizontal lines at once. Because of this, your television draws all the odd horizontal lines, followed by the even ones, followed by the odd ones again.
Making a GIF interlaced is not related to the speed of your computer or its ability to display images quickly; rather, it has to do with the speed of a user's connection. If the user has a slow connection, a GIF will load slowly. Making the graphic interlaced allows the user to see the graphic as it loads. Otherwise, the graphic will not be displayed until it is completely loaded, which might take a while.
We also set the “transparent” color, which is the color selected to melt into the background. By setting white as the transparent color, we indicate that anything drawn in white should actually be drawn in the color of the background. Since GD drawings have a white background by default, setting white as the transparent color means our graphic will appear to be floating in the user's browser, rather than set against a white background.
After all this, we can finally draw. We create a rectangle between 20,20 and 80,80, which should fill most of the 100x100 area defined when we created $image. We choose to draw the rectangle in green, using %COLORS, which we defined earlier. Finally, we fill the rectangle with blue by pointing GD to a point inside of the rectangle and asking it to fill the area.
GD has a number of other functions, including the ability to draw polygons, create custom brushes and fill with specified patterns. You can add text labels, which are incorporated into the final graph. You can even save graphics to disk in GD's own format, then load them again and continue to manipulate them before turning them into GIFs.
GD is a wonderful tool for drawing on the Web. With it, you can create all sorts of marvelous things. Most of the web graphics I want to create are charts and graphs based on various types of data. I could use GD to create such graphs, but that would involve too much work.
Luckily, as is often the case with Perl, someone else had this problem and decided to solve it. Martien Verbruggen wrote and distributed the GIFgraph module, which allows us to create different types of charts based on a list of data points. GIFgraph uses GD, but provides us with an object-oriented interface to the new graph. This allows us to think in terms of graphs, styles and shapes—as opposed to GD, which would force us to think in terms of pixels and lines.
GIFgraph is actually a set of modules collected under the single “GIFgraph” name. One module handles bar graphs, another pie charts and so on, for nearly ten different types of graphs.
In Listing 2, for example, we create a simple bar graph, with labels “a”, “b” and “c”, with respective values of 1, 2 and 3. We do this by creating an array, traditionally called @data. Each element of @data is an array reference, with the first element corresponding to the labels. Our program displays results from a single set of data:
my @data = (["a", "b", "c"], [1, 2, 3]);
We could easily compare two sets of data:
my @data = (["a","b","c"], [1,2,3], [4,5,6]);GIFgraph is smart enough to use different colors for different sets of data. So given the above data, it will draw six bars—three each of two colors, with the values 1 and 4 associated with the “a” label, the values 2 and 5 associated with the “b” label, and the values 3 and 6 associated with the “c” label.
Before we can send the output to the user's browser, we must send a MIME type. Because it relies on GD, GIFgraph can produce output in GIF format. We tell the browser what to expect with the following command:
print $query->header(-type => "image/gif");
Now we will create our graph object and send its GIF output to the user's browser:
my $graph = new GIFgraph::bars; print $graph->plot(\@data);Notice how we pass an array reference to @data by prefacing it with a backslash (\@data). Passing @data as a reference ensures it will be handed to the plot method as intended.
In this example, we created a bar chart. What if we want a different kind of chart? We can do that by importing a different Perl module (e.g., GIFgraph::lines instead of GIFgraph::bars) and making $graph an instance of the new type.
Note that calling $graph->plot creates a graph based on @data but does not send it to the user's browser. This method returns the resulting GIF to its caller, allowing us to save it to disk, send it to the user's browser or manipulate the resulting GIF in Perl or external tools. Since the CGI standard mandates all output to STDOUT be sent to the user's browser, we can display the chart on the user's computer by printing the result from this call.
Now that we have seen a simple program that produces a chart, let us look at a slightly more complicated example, one which mirrors some real-world situations. Assume we want to create a graph based on a text file. For example, assume we are implementing part of the reporting function for a web-based voting system. The results of a given election will be placed in a text file, called votes.txt:
Tom 123456 Dick 100000 Harry 20000
The election data is stored in the above file, with the candidate's name and the number of votes he received separated with one or more tab characters. This allows the candidates' names to contain space characters, such as between first and last names.
We could use a bar chart with this data, but it would not be nearly as useful as a pie chart, in which each candidate is given a proportional part of the pie. As you can see in Listing 3, our program vote.pl is not very difficult to create and produces results relatively quickly.
It does this by iterating through each line of votes.txt, using Perl's built-in “split” function to turn a scalar value (the line from votes.txt) into a list value. In this case, we split that line across tabs, putting everything before the tab in $name and everything after the tab in $votes. We then use the “push” function to add these values to @names and @votes, respectively, which are built up with every iteration through votes.txt. If there are four candidates in votes.txt, this loop is executed four times, and @names and @votes each has four elements.
When we exit from the loop, we create @data by inserting references to @names and @votes. As always, the first element of @data is an array reference containing the names. Subsequent elements of @data contain values; in this case, we have only one value, @votes. We create the graph by creating an instance of GIFgraph::pie and then plotting it to the user's browser.
The above example introduced us to the notion of creating a chart based on data stored on disk. While this is certainly the right idea, storing such data in a text file has its drawbacks. It is more common and more useful to put such data in a relational database.
Creating a chart based on a table in a relational database is not very different from creating one based on a text file. The main difference is with the loop we use to iterate over our input data. In vote.pl, we iterated over each line of votes.txt, turning each line of text into a name,value pair, which we then added to @data. When we retrieve information from a database, the information is already split into name,value pairs for us.
Before we can begin to write db-vote.pl (a database version of vote.pl), we must create a table in our database. As usual, I will use MySQL, a “mostly free” database described in Resources. MySQL's syntax is standard enough for most purposes, and most of the following should work with other databases as well.
Relational databases expect to receive input in SQL, the “structured query language”. SQL is not a programming language—so while we can create all sorts of queries to manipulate data in our table, we must embed those queries within a program written in a full programming language. Perl's DBI (“database interface”) module allows us to embed SQL statements inside our Perl programs.
We can create a new table by issuing the following SQL command:
CREATE TABLE Votes ( candidate_name VARCHAR(30), votes_received BIGINT UNSIGNED );
While we could send the above to our database server from within a Perl program, it is more usual to type it directly from within an interactive database client. MySQL comes with an interactive client called mysql which allows you to send queries to the database (and receive responses) without having to embed your statements inside a Perl program.
After you issue the above SQL query, the database server will create a new table, Votes, with two columns. The first column, candidate_name, allows for up to 30 characters. The second column is defined to be a BIGINT UNSIGNED, that is, a large integer. We name this column votes_received.
We will now take a leap of faith and assume that, after the polls close on election night, our database table will magically be filled with appropriate values for each candidate. (In a real application, we would probably design things differently, storing each candidate's name in a second table and perhaps even storing each vote in its own row. We will ignore real-world concerns for the time being, so as to concentrate on how to create a graph with this data.)
Assuming our table has been populated with a list of candidates' names and their votes, how can we rewrite vote.pl so it takes its input from a database? As mentioned above, we will rely on DBI, Perl's database interface, which provides a uniform, object-oriented interface to most popular relational databases. Each database is described in a DBD, or database driver, and is imported automatically when we open a connection.
Opening a connection to the database creates a “database handle” object, traditionally called $dbh. We use this object to create a “statement handle”, traditionally called $sth, with which we send the SQL to the database server. Our query, in this case, is rather simple:
SELECT candidate_name, votes_received FROM Votes
When it executes this query, the database server will return a two-column table to the user—in this particular case, the entire contents of the Votes table. Each row of the table corresponds to a line in the text file votes.txt which we saw earlier.
DBI provides us with a number of methods by which to retrieve data from $sth. The most commonly used methods retrieve a row as an array, either in its usual form (using $sth->fetchrow_array) or as a reference (using $sth->fetchrow_arrayref). While the arrayref method is more efficient, beginning Perl programmers often prefer to avoid references, which sometimes confuse them. In both cases, the order of elements in the returned list is determined by the order in which columns were named in the query.
The rest of db-vote.pl (see Listing 4) continues in almost the same way as vote.pl, pushing the values in each row onto @names and @values, then using those to create @data.
It is generally preferable to put such information in a database, because of the reliability and flexibility offered by relational databases. Remember, though, there is no free lunch: a relational database is inherently much slower than a flat ASCII text file. Moreover, our CGI program opens a connection to the database each time it is invoked, an expensive and time-consuming operation. For these reasons, vote.pl will almost certainly execute faster than db-vote.pl. Whether this is an appropriate trade-off depends on the number of visitors to your site, as well as the nature of your web applications.
Now that you have seen how to create simple graphs based on various inputs, let us spend a few moments discussing how you can modify the outputs. GIFgraph allows you to change just about every aspect of the graph, including the colors, placement and style of the legend, and the way in which the axes are marked. This is done with the set method. Of course, certain settings are active for only certain types of graphs; for instance, there are no axes on a pie chart, meaning that setting the axis labels will be meaningless.
Here is one example invocation of set:
$graph->set(x_label => "Candidates", y_label => "Number of votes", title => "Voting results", logo => "corplogo.gif", zero_axis => 1);
The GIFgraph manual page, available by typing perldoc GIFgraph after installing the package, describes these and many other options in detail. However, the above is probably a good starting point and demonstrates how the various factors describing a chart can be set. In the above example, we label the X axis Candidates, the Y axis Number of votes, include our corporate logo on the chart, and ensure the axes will always begin at the origin (0, 0). There are also options to choose colors and fonts, as well as define how often ticks should appear on each axis—if you read the manual, you will likely be overwhelmed by the wealth of options.
As you can see, it is not particularly difficult to create graphics on the fly from within our CGI program. Even more impressive—as well as generally useful—is the ability to create many types of charts and graphs in only a few lines of code.
Next month, we will take a further look at dynamically generated graphics, looking at a simple application that tracks a user's stock portfolio. That application will revisit two topics we discussed last month, namely HTTP cookies and saving state to a database.