Reuven shares more tips on the usefulness and limitations of three-tiered architecture—and just what is it?
Last month, we began our investigation of three-tiered design for our web applications. By separating the database server from the web application itself by means of a “middleware” object layer, we simplify the logic in our web applications. Furthermore, by adding an abstraction layer between our web application layer and our database layer, we gain the ability to use the same middleware in non-web applications, as well as the possibility of changing the back end without telling the web application.
By the end of last month's column, we had implemented a simple middleware layer that could communicate with the People and Appointments tables we created in a PostgreSQL database. This month, we will briefly look at some web applications we can develop using these objects. You will see that at no time does our web layer directly access the relational database; the SQL is all contained within the objects.
In an ideal universe, we could create the web application layer using any language or technology we might want, communicating with the middleware layer using a universally agreed-to protocol. However, the world is not quite as advanced as we might like, and our choice of an object layer forces our hand when choosing a web application environment.
We created our objects in Perl, so we will need to use Perl to implement our web application. To avoid the overhead associated with CGI programs, and because we can get a great deal more power by tapping into the mod_perl module for Apache, we will use Mason, the Perl-based template and development application environment that we looked into last year. Each Mason component is compiled as necessary into a Perl subroutine, which is then compiled into Perl opcodes. These opcodes are then cached in the mod_perl module inside of Apache, where they can be executed at a much faster rate than would be possible using CGI.
Our first web application example will allow us to add a new person to our database. This will require two Mason components: an HTML form (which could equally well be a static form) and one which attempts to add a new person to the database. In order to accomplish this, we will use the middleware People object, which connects to the database for us and attempts to store a new row in the database. Simple versions of these two components are shown in Listings 1 and 2. These listings are too long to print here; they are available at ftp.linuxjournal.com/pub/lj/listings/issue82. The HTML form (add-person-form.html) submits its name-value pairs to add-person.html. The latter creates an instance, People, then invokes the new_person method to create a new person:
my $success = $people->new_person (first_name => $first_name, last_name => $last_name, country => $country, email => $email);
If $success is true, we know that a new person was added to the database with the arguments that we passed to $people->new_person. Otherwise, we know that the invocation has failed.
However, this is a very crude way of determining whether things have succeeded or failed; rather than present users with an all-or-nothing proposition, it would be nice to tell them what they did wrong so that they can fix the problem. If a hung database process produces the same error message as does an attempt to add a second person with the same e-mail address, it will be hard for anyone to solve the problem.
Thus, the solution is for our web application to check its inputs before passing them to the middleware layer. The more such checks we can insert into our code, and the more application-level error messages we can display, the better.
Our add-person.html component performs two basic checks that demonstrate this: It uses Mason's <%args> section to require that each of the potential arguments has been passed. An HTML form that tries to submit its values to add-person.html must provide each of the listed form elements, or Mason will refuse to honor the request and print a stack trace describing what went wrong. End users won't see this error if they make a mistake filling out the form, but you'll see it if you leave required <input> tags out.
Once our Mason component executes, we can thus be sure that we have at least received the appropriate name-value pairs. But do they contain legal values? In an “unless” statement at the top of add-person.html, we check that we received non-empty values for the four parameters that we will use in our invocation of $people->new_person. If any of them are missing, a message is displayed telling the user what is expected.
To be even safer, we also check that the e-mail address looks relatively valid. The regular expression in Listing 2 will not match all e-mail addresses, but it is good enough for the purposes of this simple example. Users who try to pass an invalid e-mail address are shown an error message that tells them what to change.
Once we can be sure that the values are relatively sane, we can then invoke $people->new_person. Notice how add-person.html manages to do all of this without ever talking directly to the database. DBI is obviously taking an active role in each invocation of $people->new_person, but that happens behind the scenes, and our Mason components don't need to concern themselves with it. This means that if the People object has been thoroughly debugged, there should not be any chance of encountering SQL errors.
Now that we have seen how to add a new person to the database using our People object, let's try a slightly more difficult task: changing a person's first name, using the update_first_name method. (See Listing 3 and Listing 4 at ftp://ftp.linuxjournal.com/pub/lj/listings/issue82/ for examples.) We can only invoke this method once we have already selected an individual, which means that our editing form will have to let us do so.
While it might be tempting to let users select an entry by typing a name or e-mail address into a text field, this is prone to too many errors to be effective. Instead, we will allow users to choose from a <select> list. This removes the possibility that a user will enter an e-mail address (or another defining characteristic) for a person who might not be in our database.
We want to use a unique key to choose the person whose first name will be modified—but at the same time, it seems a bit impersonal to present a list of e-mail addresses. My solution was to go back to the People object (People.pm) and define a new method, get_names_and_addresses (see Listing 5 at ftp://ftp.linuxjournal.com/pub/lj/listings/issue82/). This returns a list of array references, where each array reference contains a name and an e-mail address. The former can be used as a unique key (and as the “value” within an <option> tag), while the latter can be used for display purposes. We can thus iterate over the e-mail addresses and produce a <select> list as follows:
<select name="email"> % # Iterate through the names and addresses, # printing them out % foreach my $info (@names_and_addresses) { <option value ="<% $info->[1] %>"><% $info->[0] %> % } </select>
Allowing users to edit other user attributes would proceed in a similar way. Indeed, so long as you ensure that the user chooses a key that uniquely identifies the user, you can change any and all of its attributes using a similar type of form.
Now that we have seen how we can use the People object to indirectly manipulate our People table in the database, we will start to look into our appointment book, handled by the Appointments object. This object allows us to add an appointment with a particular person on a particular day and time.
In order to accomplish this, we will (once again) need two components. The first component (add-appointment-form.html, in Listing 6 at ftp.linuxjournal.com/pub/lj/listings/issue82) produces an HTML form that allows users to enter a new appointment into the system, choosing a person from a predefined <select> list. (If this were an actual project, I would put the <select> list in a separate component, allowing other components to produce a menu of entries in the address book.) In addition, we have to know when the appointment starts, as well as when it ends. Once again, I prefer to have people select a date and time from <select> lists, since it removes the problems associated with time and date formats.
The following Mason code produces the three <select> lists that we need in order to have the user choose a month, day and year. By defining the @months and @years arrays in advance, we can make the code more readable, as well as update the system for future years quickly and easily:
<select name="begin_month"> % foreach my $month (@months) { <option value="<% $month %>"><% $month %> % } </select> <select name="begin_day"> % foreach my $day (1 .. 31) { <option value="<% $day %>"><% $day %> % } </select> , <select name="begin_year"> % foreach my $year (@years) { <option value="<% $year %>"><% $year %> % } </select>
The second component, add-appointment.html (see Listing 7 at ftp.linuxjournal.com/pub/lj/listings/issue82), allows us to add a new entry into the appointment calendar. It checks (using an <%args> section) that we have submitted all of the required name-value fields from add-appointment-form.html. We then issue the same kinds of basic checks that our other components have done.
Now that we have demonstrated how easy it is to create a three-tier web application, it's time to consider how many tiers we're really using. Does the term “three-tier” really apply here?
The term “three-tiered architecture” grew out of a dissatisfaction with another popular architecture known as “client/server”. For example, databases and web servers are both examples of modern client/server systems. Just as a client/server system typically refers to two physical computers, a three-tiered system refers to three physical computers, with each tier residing on a separate machine.
By contrast, the simple three-tiered application we have examined certainly had three layers in that there were distinct software systems that had clear goals and APIs and made it possible for the application and database layers to speak through a common middleware layer. At the same time, at least two of these layers (the web application and the middleware objects) were on the same computer without any real possibility for separation. If the web application were to become swamped with traffic, we could certainly add one or more identical Apache servers—but there is no way to put the application layer on one computer and the middleware objects on another.
So while I believe that we have now demonstrated some of the advantages of three tiers from the perspective of an application developer needing standard APIs, we have not seen a true implementation of such a system. In order to do so, however, we must have the ability to perform remote procedure calls (RPCs), such that a web application on one computer can invoke a subroutine or object method on another computer. This is possible and is getting increasingly easy with the growth of SOAP (Simple Object Access Protocol), but it brings with it a number of other problems and caveats, including the need to learn yet another transmission protocol.
While we're reconsidering how to define tiers—perhaps we should consider that the application developed does have three tiers, but that they are defined in a different way than we have considered until now. Instead of counting the tiers as (a) database, (b) middleware layer and (c) web server, perhaps we should count them as (a) database, (b) web server and (c) web browser? If we think in these terms, then we have indeed created a three-tier architecture—but come to think of it, so has anyone who has ever written a CGI program that talks to a database.
Moreover, we can introduce additional layers of abstraction into the mix here, complicating things further. What about stored procedures, triggers and views created on the relational database? Although not a physical tier, it can certainly make life easier for the person writing either an object layer or an application that accesses the database. Indeed, stored procedures are often better than an object middleware layer, because they execute on the database and are precompiled, making them relatively speedy.
We can also execute code on the web client (i.e., inside of the web browser) using JavaScript. While I generally encourage my clients to avoid JavaScript as much as possible, this buggy, insecure language riddled with cross-platform incompatibilities is the only way to execute programs from within a web browser, rather than on the server.
So, when we have a web application that uses a relational database, stored procedures, an object middleware layer, a web application layer and client-side JavaScript divided between three computers, how many tiers do we have? It's probably still three, but the fact is that it doesn't really matter what you call it. In the end, a decent design that takes into account your project's specifications, including the need for future growth, is the right way to go—regardless of how it jibes with the latest buzzwords and techniques.
Now that we have looked at a very simple three-tier project, it's time to look at some of the problems associated with such a design. I am not saying that three-tier solutions are inherently evil, but neither are they a panacea. Like most solutions, they are appropriate under certain circumstances. In many cases, splitting the design and implementation among several people can be easier when you divide the work into different layers, as we saw with our web-based appointment calendar. One person could write all of the necessary code, but two people could probably do it more quickly and easily, given a well-documented interface between them.
As with any engineering solution, there are always trade-offs. With a three-tiered design, perhaps the most important trade-off is time. Such an architecture takes longer to specify and design, even if it will eventually be more robust, easier to write, easier to test and easier to divide among a number of programmers.
While dividing a project into many parts might make it easier to specify and test each part, it makes integration testing all the more important and difficult. If everyone sticks to the published and agreed-to API, such testing does not have to be terribly difficult. But there are always differences between the specification and the implementation, and integration testing tends to bring these out. The more tiers in a project, the more important and difficult such testing can be.
Finally, it can be difficult and frustrating to create an object middleware layer that provides an interface to the database. SQL is not a perfect language, but it allows us to express a very large number of queries with a very small number of commands.
Removing SQL from the Mason components and forcing programmers to work with an object API, means that the programmer will be limited to a small subset of the database's power and flexibility. Every time more functionality is needed, the programmer will have to request it be added to the middleware's API. Being able to specify any database query inside of an HTML template (e.g., a Mason component) is a liberating experience for a programmer, and taking that freedom away can be frustrating.
Programmers designing large or complex web applications are finding it increasingly useful to adopt the three-tiered architecture beginning to replace the simpler client/server model that has been favored for the last decade. Indeed, creating three-tiered applications can often make life easier. In the end, however, you will have to decide whether this solution is appropriate for your needs or if it's overkill given your time frame and specifications.