LJ Archive

The Quick Road to an Intranet Web Server

Russell C. Pavlicek

Issue #55, November 1998

Apache and Linux make the task simple.

On a recent project, I was busily working on my Linux workstation while a nearby coworker was attempting to install a well-known web server on a popular operating system. As the day wore on, his frustration grew. By the end of the day, he had most of the web server functioning, but was still experiencing difficulty with configuring certain portions of it.

Here was a man of proven technical abilities fighting to set up an Intranet web server for a project. It obviously wasn't a simple task. Through dogged determination, my friend eventually prevailed, but not without gaining a few gray hairs in the process.

I began to reflect on my own experience with web servers. The unobtrusive Linux workstation I was using (a 486/66 with 16 MB of memory and a 0.5 GB SCSI drive) was equipped with the Apache web server. Six months earlier, I had built this system from unused parts lying around our lab and installed Red Hat 4.2 on it. When I built the machine, it was my hope to demonstrate to my coworkers how impressive and substantial Linux had become.

What I hadn't expected was to find myself even more impressed with Linux and Apache as a result of my friend's experience.

When I installed Red Hat 4.2 on the machine, I specified that I wanted the Apache web server installed along with the other packages I selected. When the installation was complete, I aimed a browser at the newly installed machine and found a friendly little Red Hat-supplied web page staring at me, telling me which file to edit to begin loading content into my new web site.

No fussing with parameters or complex configuration was needed. I just began editing the default HTML file, and soon had a neat web site exemplifying the benefits of Linux.

Why an Intranet Web Server?

As corporations worldwide try to catch the Internet wave, more and more companies realize the need to build a decent Intranet in order to share information within their organization. Traditional paper documents, such as policy manuals, software documentation, design specifications, press releases and corporate announcements, are suddenly more accessible within the organization via an Intranet web server. New capabilities such as support discussions, document searches and audio and video archives are creating opportunities to increase the availability of information and to add to an organization's competitive advantage.

At the center of many of these technologies is the need for a robust and efficient Intranet web server. Yet even a casual glance at the commercial software solutions employed for developing an effective server will reveal that constructing one can be an expensive proposition.

Here is a place where Linux and Apache shine! The combination of the robust Linux operating system with the industry-leading Apache web server (see Resources) creates a flexible foundation for a low-cost, highly functional Intranet web site. Using Linux to solve a business problem is one of the best methods of convincing people that this operating system can hold its own in the corporate arena.

Of course, Linux and Apache can also be configured as a highly effective Internet web server. An Internet server requires all the elements of a good Intranet server, plus a good deal more. In particular, security and performance are usually much more critical in an Internet scenario. However, the issue of installing a web server into a less hostile internal network can be handled with ease, especially if you are using a Linux distribution which does most of the work for you.

What if your Linux distribution doesn't come with Apache preconfigured? The good news is that it doesn't take much to install Apache on almost any Linux machine. It is possible to set up a working web server on your Intranet in just a few minutes without having to be a technical wizard.

How to Install

Installation of Apache is a snap using Red Hat 4.2, Debian 1.3.1, or OpenLinux Base 1.x. The Debian distribution goes through a short configuration dialogue (if in doubt, just press the ENTER key a few times), while the Red Hat and OpenLinux distributions have a more or less preconfigured installation. In all three distributions, the process of creating a working Intranet server takes neither experience nor a significant amount of time.

If your distribution doesn't have an Apache package, you can always obtain a source kit from http://www.apache.org/. Here's an example (using Apache 1.2.5) of the commands used to quickly unpack the kit and configure it for building:

tar xzf apache_1.2.5.tar.gz
cd apache_1.2.5/src/
./Configure

If you want to store the configuration files somewhere other than the default location of /usr/local/etc/apache/conf/, simply change the following line in src/httpd.h (shown here set to /etc/httpd):

#define HTTPD_ROOT "/etc/httpd"

Note that the definition of HTTPD_ROOT does not include a closing slash. If you don't make the change in the file, but still want to move the configuration files, you will be able to specify the location of httpd.conf at startup time by using the -f option on the httpd command line.

Whether or not you decide to make the change above, the final step is to compile the web server. Just type make and the httpd executable will be created in the src subdirectory. The source kit contains additional instructions in README and src/INSTALL should you wish to do anything fancy. A vanilla compile should work just fine for most situations.

How to Configure

Apache has many wonderful configuration options available. To someone who has never managed a web server, the list of options might seem quite daunting. However, it is possible to employ a simple cookbook method to get your web server up and running in short order. The following values will produce a working Intranet server which should perform marvelously in most organizations. Of course, if you have special security requirements, a review of the Apache documentation is quite helpful.

Any Linux distribution with an easy installation of Apache will likely have a reasonable set of parameters already enabled. Even if your web server is live seconds after installation, you may want to review the configuration files just to find out what features have been enabled and disabled by default.

The Apache configuration generally has four files of note. In Red Hat, these are found in /etc/httpd/conf/. In Debian, they are located in /etc/apache/. If you built Apache from the sources as described above, they should be in /usr/local/etc/apache/conf/. The four files are access.conf, httpd.conf, srm.conf and mime.types.

If the files are not found in your kit, you should be able to locate the same group of files with names ending in .conf-dist. Simply copy each of these files to the appropriate location (e.g., /etc/httpd/conf/ if you are using Red Hat), giving each copy a name ending in .conf, and edit the files so that they contain the parameters described below. Note that the Apache-supplied configuration files contain useful hints and many additional parameters which are not discussed here for the sake of space. It is therefore best to edit the supplied files to contain the following parameters, rather than attempting to construct new configuration files from scratch.

Of the four files, mime.types is the least likely to require modifications. It is a table which associates MIME headers with file types. For example, the table will say that a file ending with .gz should generate a MIME header of application/x-gzip.
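
To make the table idea concrete, here is a minimal Python sketch of such a lookup. It is not Apache's own code. The sample entries, the function names load_mime_table and mime_type_for, and the text/plain fallback are all assumptions chosen for illustration.

```python
# Hypothetical excerpt in the mime.types layout: a MIME type followed by
# zero or more file extensions; lines starting with '#' are comments.
SAMPLE_MIME_TYPES = """\
# MIME type            extensions
application/x-gzip     gz
text/html              html htm
image/gif              gif
text/plain             txt
"""


def load_mime_table(text):
    """Build an extension -> MIME type dictionary from mime.types text."""
    table = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        mime_type, *extensions = line.split()
        for ext in extensions:
            table[ext.lower()] = mime_type
    return table


def mime_type_for(filename, table, default="text/plain"):
    """Return the MIME type for a file name based on its final extension.

    The fallback type is an assumption for this sketch, not a value read
    from any real server configuration.
    """
    _, dot, ext = filename.rpartition(".")
    if not dot:
        return default  # no extension at all
    return table.get(ext.lower(), default)


table = load_mime_table(SAMPLE_MIME_TYPES)
print(mime_type_for("linux-howto.gz", table))
print(mime_type_for("index.html", table))
```

Pointing load_mime_table at the text of your own mime.types file should show which header the server would associate with a given file name, at least to the extent that this simplified lookup mirrors the real one.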

The most likely candidate for parameter adjustment is the access configuration definition. In access.conf, try using the following values:

<Directory /home/httpd/html>
Options Indexes Includes ExecCGI FollowSymLinks
AllowOverride None
order allow,deny
allow from all
</Directory>

In this example, taken from a modified Red Hat file, the entry says that the main HTML files for this server are stored in /home/httpd/html. The options include:

  • Indexes: if the user specifies a URL that points to a directory name rather than a file name and that directory does not contain an index file (such as index.html), Apache will display a list of the files contained in the specified directory. If you don't want this behavior, simply omit the “Indexes” keyword from the “Options” line.

  • Includes: this enables server-side includes, which let an HTML page direct the server to insert the contents of other files when the page is served.

  • ExecCGI: if the URL specified is actually a CGI script, this will allow that script to execute. If you are not interested in CGI scripts, don't include the keyword.

  • FollowSymLinks: Let's say that I've created a symbolic link to my CD drive in this directory. This keyword will instruct Apache to allow access to that CD as if it were a subdirectory of this directory. This facility can be quite handy if you wish to serve up the contents of a CD or allow access to the HOWTOs which normally reside in a tree such as /usr/doc. One quick symbolic link, and you can allow Apache to serve those files without moving them or creating individual symbolic links. But remember to create symbolic links with care, or else you might find that your web site is serving up documents you never intended to be universally available.
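
Since a careless symbolic link can expose more than you intended, it may be worth listing the links under your document tree from time to time. The following Python sketch is one way to do that. The function name find_symlinks and the directory names in the demonstration are hypothetical, and the demonstration builds its own scratch tree rather than touching /home/httpd/html.

```python
import os
import tempfile


def find_symlinks(document_root):
    """Return sorted (link_path, resolved_target) pairs for each symlink found."""
    found = []
    # os.walk does not descend into symlinked directories by default, so
    # each link is reported once and the walk stays inside the tree.
    for dirpath, dirnames, filenames in os.walk(document_root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                found.append((path, os.path.realpath(path)))
    return sorted(found)


# Demonstration on a throwaway tree: an "html" directory containing one
# symbolic link to a separate "doc/HOWTO" directory.
with tempfile.TemporaryDirectory() as scratch:
    docroot = os.path.join(scratch, "html")
    howtos = os.path.join(scratch, "doc", "HOWTO")
    os.makedirs(docroot)
    os.makedirs(howtos)
    os.symlink(howtos, os.path.join(docroot, "howto"))
    for link, target in find_symlinks(docroot):
        print(link, "->", target)
```

Calling find_symlinks on your real document directory and reading down the list of targets is a quick way to spot a link that leads somewhere you would rather not publish.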

In the /home/httpd/html entry, the remaining directives translate (roughly) to “don't let per-directory access files override these settings”, “evaluate rules to allow access before the rules to deny access” and “allow access from all hosts”. Together, these rules add up to “if it's there and they ask nicely, give it to them”.
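
The effect of the evaluation order is easier to see in a small model. The Python sketch below is a simplified illustration, not Apache's implementation: it understands only the keyword “all”, exact host names and domain suffixes, and the function names and example.com hosts are hypothetical.

```python
def host_matches(host, pattern):
    """True if the pattern is 'all', the host itself, or a domain suffix of it."""
    if pattern == "all":
        return True
    return host == pattern or host.endswith("." + pattern.lstrip("."))


def is_allowed(host, order, allow=(), deny=()):
    """Decide access for a host under a simplified model of order/allow/deny."""
    allowed = any(host_matches(host, p) for p in allow)
    denied = any(host_matches(host, p) for p in deny)
    if order == "allow,deny":
        # Allow rules are checked first and deny rules last, so a deny
        # match wins; a host matching nothing is refused.
        return allowed and not denied
    if order == "deny,allow":
        # Deny rules are checked first and allow rules last, so an allow
        # match wins; a host matching nothing is admitted.
        return allowed or not denied
    raise ValueError("order must be 'allow,deny' or 'deny,allow'")


# The entry from access.conf: order allow,deny with "allow from all".
print(is_allowed("ws12.example.com", "allow,deny", allow=["all"]))

# Same order, but with one (hypothetical) subdomain shut out.
print(is_allowed("pc3.lab.example.com", "allow,deny",
                 allow=["all"], deny=["lab.example.com"]))
```

With “allow from all” and no deny rules, every host gets in, which is why this combination suits a friendly internal network.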

The entry for the CGI directory is generally more minimal:

<Directory /home/httpd/cgi-bin>
AllowOverride None
Options None
</Directory>

The parameters in httpd.conf that control the operation of the Apache daemon are as follows:

HostnameLookups on

If HostnameLookups is “on”, the server will use DNS to try to determine the user's host name for logging purposes.

User nobody
Group nobody

User and Group determine the privileges under which the server answers requests. The server's child processes will act as if they were jobs created by the specified user and group (in this case, “nobody”).

ServerAdmin root@localhost

ServerAdmin sets the e-mail address that the server includes in the error messages it returns to the client.

ServerRoot /etc/httpd

ServerRoot sets the base directory for configuration and log files.

ErrorLog logs/error_log
TransferLog logs/access_log
RefererLog logs/referer_log
AgentLog logs/agent_log

These parameters specify the name and location of various log files. The directories specified are relative to the ServerRoot directory. Once you've run Apache for a while, examine these log files. In them, you'll find information about which pages were accessed, when and by whom. You'll even find information about the page that referred the visitor to your page and the type of browser the user was employing to look at your site.
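
As a starting point for examining the transfer log, here is a hedged Python sketch that counts requests per page and per client. It assumes access_log is written in the Common Log Format; the sample entries, host names and the function name summarize are made up for illustration.

```python
import re
from collections import Counter

# Made-up sample entries in Common Log Format; hosts and pages are hypothetical.
SAMPLE_ACCESS_LOG = """\
ws12.example.com - - [05/Nov/1998:09:14:02 -0500] "GET /index.html HTTP/1.0" 200 2326
ws07.example.com - - [05/Nov/1998:09:15:40 -0500] "GET /policies/travel.html HTTP/1.0" 200 10412
ws12.example.com - - [05/Nov/1998:09:16:11 -0500] "GET /missing.html HTTP/1.0" 404 207
ws31.example.com - - [05/Nov/1998:09:20:57 -0500] "GET /index.html HTTP/1.0" 200 2326
"""

# host ident authuser [timestamp] "method path protocol" status bytes
CLF_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<when>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) (?P<size>\S+)$'
)


def summarize(log_text):
    """Count requests per page and per client host; skip unparseable lines."""
    pages, hosts = Counter(), Counter()
    for line in log_text.splitlines():
        match = CLF_LINE.match(line)
        if match is None:
            continue
        pages[match.group("path")] += 1
        hosts[match.group("host")] += 1
    return pages, hosts


pages, hosts = summarize(SAMPLE_ACCESS_LOG)
for path, count in pages.most_common():
    print(count, path)
```

To try it on a real server, read the contents of logs/access_log under your ServerRoot and pass the text to summarize in place of the sample.
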

MinSpareServers 5
MaxSpareServers 10
StartServers 5
MaxClients 150

These parameters deal with the fact that Apache creates child processes before they are needed in order to handle any sudden increase in incoming traffic. MinSpareServers and MaxSpareServers specify the minimum and maximum number of idle children which should exist at any point in time. StartServers sets the number of children created at startup, and MaxClients sets the upper limit on the number of children, and therefore the number of clients, that can be served at once. Using the command ps ax will reveal the children which the Apache daemon is currently running.

The file srm.conf contains many different items. It deals largely with the name space of the server as well as how the server responds to requests. Of special interest are the following lines:

DocumentRoot /home/httpd/html
Alias /icons/ /home/httpd/icons/
ScriptAlias /cgi-bin/ /home/httpd/cgi-bin/
UserDir public_html

DocumentRoot specifies the top of the directory tree that contains the pages to be served. Alias allows the specified directory to be accessed by a pseudonym, even though it is not under the DocumentRoot tree. ScriptAlias is like Alias except that the directory will contain CGI scripts. UserDir specifies the user's subdirectory to look in for any URL that uses a /~username/ specification. For example, if the user's home directory is /home/username/, then that user's pages will be served from /home/username/public_html/.
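
The way these four settings carve up the server's name space can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not Apache's actual translation logic: it assumes every home directory lives under /home (the HOME_BASE constant is hypothetical), ignores the difference between serving a file and running a script, and does no checking of the path at all.

```python
import posixpath

# Values taken from the srm.conf lines above.
DOCUMENT_ROOT = "/home/httpd/html"
ALIASES = [
    ("/icons/", "/home/httpd/icons/"),      # Alias
    ("/cgi-bin/", "/home/httpd/cgi-bin/"),  # ScriptAlias
]
USER_DIR = "public_html"

# Assumption for this sketch: every user's home directory is /home/<name>.
HOME_BASE = "/home"


def map_url(url_path):
    """Translate the path part of a URL into a file system path."""
    # Aliased prefixes are mapped outside the DocumentRoot tree.
    for prefix, target in ALIASES:
        if url_path.startswith(prefix):
            return target + url_path[len(prefix):]
    # /~username/... is looked up in that user's UserDir subdirectory.
    if url_path.startswith("/~"):
        user, _, rest = url_path[2:].partition("/")
        return posixpath.join(HOME_BASE, user, USER_DIR, rest)
    # Everything else is found under DocumentRoot.
    return posixpath.join(DOCUMENT_ROOT, url_path.lstrip("/"))


for url in ("/index.html", "/icons/text.gif", "/~alice/notes.html"):
    print(url, "->", map_url(url))
```

Here “alice” and the file names are placeholders; substitute a URL from your own site to see which directory its file would be expected in.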

Conclusion

The combination of Linux and the Apache web server is a quick and inexpensive way to build a robust Intranet web server. Despite its multiplicity of options, Apache can be quickly configured to serve the needs of most organizations. In fact, most current Linux distributions come with Apache preconfigured and ready for action. If your needs should become more complex, Apache can grow with you to do the job.

Resources

Russell C. Pavlicek is employed by Digital Equipment Corporation as a software consultant serving U.S. Federal Government customers in the Washington, D.C. area. He lives with his lovely wife and wonderful children in rural Maryland where they serve Yeshua and surround themselves with a variety of furry creatures. In his minuscule amounts of spare time, he continues to develop the Corporate Linux Advocate home page at http://www.geocities.com/SiliconValley/Haven/6087/. His opinions are entirely his own (but he will allow you to adopt one or two if you ask nicely). He can be reached at pavlicek@altavista.net.
