In the battle for web visitors, milliseconds count. A few simple changes will help your site stay popular.

By James Mohr


Four seconds might not be a long time, but it seems like forever when you are waiting for a page to load. If you are running a commercial website and users perceive your site as slow, you might just lose a customer. With the expectation of fast load times, website administrators know that every second (or fraction of one) counts. If you look closely at your web server environment, you might just find that half the average wait time is unnecessary. But tuning is often a matter of trade-offs - performance improvements sometimes require sacrifices in other areas. Before you make any changes, be sure you understand the implications.

Many factors influence web server performance, and they interact in complex ways to produce the performance you experience at your website. It isn't possible to cover every possible issue, but I will highlight some tips for a faster Apache.

The Front Line

The most common Apache default configuration uses the prefork Multi-Processing Module (MPM), a non-threaded, "pre-forking" web server. The prefork MPM is useful on sites that need to avoid threading because they use non-thread-safe libraries, as well as where you need to ensure that individual requests do not interfere with each other. By running ps on your system, you will see several processes running a command like /usr/sbin/httpd2-prefork, indicating it is the pre-fork version. Pre-fork means that server processes are started before they are actually used.
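For example, you could list the running Apache processes like this (the binary name varies by distribution - httpd2-prefork is the SUSE name; elsewhere, look for httpd or apache2):

ps -C httpd2-prefork -f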

The big four Apache performance tuning directives are StartServers, MinSpareServers, MaxSpareServers, and MaxClients. The number of spare processes is defined by the values MaxSpareServers and MinSpareServers, and the number to start is defined by StartServers.
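For reference, a minimal prefork configuration block might look something like the following sketch (the values are illustrative, not tuned recommendations):

<IfModule prefork.c>
    StartServers      5
    MinSpareServers   5
    MaxSpareServers  10
    MaxClients      256
</IfModule>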

Usually you do not need to make changes to these settings because the prefork MPM is self-regulating. However, it is important to ensure that you have set the maximum number of clients (MaxClients) high enough to cover the number of concurrent requests. Keep in mind, however, that there is a danger of running out of memory if you set this value too high.

During normal operation, Apache has a single control process that is responsible for starting the child processes that serve the remote requests. Because starting a child process takes time, often it is useful to have a number of them already running and waiting for new connections so visitors don't have to wait for a new process to start. These servers that are running but not currently servicing clients are called spares, and a couple of directives configure them.

On recent SUSE systems, things are a little easier because the most common performance-related directives are combined into a single file: /etc/apache2/server-tuning.conf.

If you have a lot of requests, you might be tempted to change the values of these directives. For example, you might want to increase the value of StartServers so that more server processes are running immediately after startup. Apache only starts one server per second, so theoretically it could take a couple of minutes for all of them to get running after a restart. If you need to restart Apache often enough that this becomes an issue, you probably have other problems that are affecting your server more than the number of Apache processes. For the most part, the default settings are fine; only if the wait after a restart really matters to your users should you set StartServers higher so that more servers start immediately.

The MinSpareServers and MaxSpareServers directives define the minimum and maximum spare servers, respectively, that should be running. As connections are created and dropped, Apache will start and stop servers as needed to maintain these values. Here, too, you rarely need to change the default values. If your server needs to handle more than the default 256 simultaneous connections, you might need to increase MaxClients.

Note that if you increase MaxClients, you might need to change ServerLimit as well. ServerLimit defines the maximum number of server processes that can be started during the lifetime of the Apache process. Whereas you can change MaxClients and gracefully restart Apache, you cannot change ServerLimit without actually stopping and restarting the Apache process.

If you are running into problems because you have a lot of simultaneous requests and want to set MaxClients higher, you can do so without stopping Apache completely. However, if you set it to something higher than ServerLimit, MaxClients will be set to ServerLimit.
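For example (with illustrative values), to allow 512 simultaneous connections you would raise both directives and then fully stop and start Apache because of the ServerLimit restriction:

ServerLimit 512
MaxClients  512

If the new MaxClients still fits within the existing ServerLimit, a graceful restart - apachectl graceful (the command may be apache2ctl on some distributions) - is enough to apply it.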

Memory

The biggest problem you are likely to encounter is not enough RAM. Particularly with web servers for which even slight delays affect the user's "experience," you should avoid swapping at all costs. If half the time of each request is spent loading existing processes back in from swap space, it is definitely time to consider more RAM.

At baseline, determine the amount of memory the system needs without Apache running. (Use a tool like top.) Subtracting this amount from your total memory gives you the approximate amount available for the Apache processes. Next, monitor the system with Apache running to get an idea of the average amount of memory each Apache process needs. Once you know this, divide the memory available for Apache by the per-process average to get the approximate maximum value you can set for MaxClients.
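An illustrative calculation: if the machine has 2048MB of RAM, the system uses 512MB without Apache, and each child process averages about 12MB, then (2048 - 512) / 12 = 128 is a reasonable ceiling for MaxClients. To estimate the per-process average, you can average the resident set sizes of the running children (assuming they run as the user wwwrun, as on SUSE systems):

ps -u wwwrun -o rss= | awk '{ sum += $1; n++ } END { if (n) printf "%.1f MB average\n", sum/n/1024 }'

Because the children share a fair amount of memory, RSS overstates the real cost of each additional process, so treat the result as a rough, conservative guide.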

Keep in mind that other processes might start up and need some memory, so it is better not to give all of the remaining RAM to the Apache processes. Also, setting this value too low can really kill performance. If all child processes are busy, new connection requests are put in the TCP queue. If the system cannot respond fast enough, the connection will time out.

MaxRequestsPerChild is another double-edged sword. This directive controls the number of requests that each child process will serve before it is restarted. This limit on the number of requests is a protection mechanism. If your application has problems (e.g., memory leaks), the amount of memory used increases with each request. With no limit, you eventually run out of memory. If the child process is stopped after a predefined number of requests, the memory is freed by the system. Obviously, if you have a lot of requests and some severe memory leaks, you might run into problems more quickly.

Setting MaxRequestsPerChild too low is likely to cause performance problems of its own, and if you really need to set it low because of memory leaks or other problems, you should fix those problems first. The default value is 0 (no limit), but a common setting is 10000. That might seem like a lot, but remember that it counts requests, not pages: any given page might pull in several images plus CSS or JavaScript files, each of which is a separate request. At, say, 20 requests per page, 10000 requests amounts to only about 500 page views before the child is recycled.

The top utility gives you a good idea of how each individual process is behaving. Typically the child processes all run as a specific user, so you could tell top to monitor only processes from the user wwwrun, for example (see Figure 1):

top -U wwwrun

Figure 1: The top utility showing the processes of the user wwwrun.

Keep Alive

The KeepAlive and KeepAliveTimeout directives work closely with the server directives discussed earlier. By turning on KeepAlive, you allow Apache to serve multiple requests over a single connection. If KeepAlive is not turned on, the client needs to open a new TCP connection for each request. Keep in mind that a request is not just a page, but everything on it, so if the page includes CSS files and several images, you can expect delays. If the client has to reconnect for each request, the server will seem a lot slower than it really is.

The KeepAliveTimeout directive tells Apache how long to wait for a further request before closing the connection. If this value is set too high and the user ends up reading the contents of a page without loading anything new, the process will sit idle and possibly make other users wait. The standard value is between three and seven seconds. Setting this value low is a good idea if you expect users to stop and read a large portion of a page before continuing, because idle connections then won't tie up server processes.
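A typical keep-alive configuration might look like this sketch (the values are illustrative; MaxKeepAliveRequests caps how many requests a single connection may carry):

KeepAlive On
MaxKeepAliveRequests 100
KeepAliveTimeout 5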

Symbolic Links

Performance is often at odds with security. One security threat comes in the form of symbolic links. Although symbolic links simplify administration, you need to be aware of the problems they can introduce.

When an application such as Apache tries to read a file, the work is done by a set of libraries provided by the operating system, so the application normally is not aware of whether the file is a real file or a symbolic link. To prevent Apache from inadvertently accessing locations that it shouldn't, it is normally configured not to follow symbolic links. However, in some cases you might want to follow symbolic links, so you are provided two options - FollowSymLinks and SymLinksIfOwnerMatch - either of which can be set in any Options directive.

As the names imply, FollowSymLinks tells Apache to follow the symbolic links it encounters, whereas SymLinksIfOwnerMatch says to follow them only if the owner of the link and the owner of the target are the same.

The loss of performance comes when neither FollowSymLinks nor SymLinksIfOwnerMatch is used. Before opening the target file, Apache must check whether any of the elements in the path are symbolic links; if so, access is denied. If FollowSymLinks is enabled, this check is not done, and file access is a little bit faster. In the case of SymLinksIfOwnerMatch, the check still has to be made on every element of the path to ensure that the owners match. Even if you don't use symbolic links, it is worthwhile to include FollowSymLinks to give yourself a little performance boost.
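For example (the path is illustrative - adjust it to match your DocumentRoot):

<Directory "/srv/www/htdocs">
    Options FollowSymLinks
</Directory>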

.htaccess

Most of the web servers I have worked with have been virtual domains of one type or another. Sometimes I have complete root access and can configure things as I see fit. In other cases, mine is one of dozens of domains on a single machine, and the webmasters aren't given full access. To simplify configuration, Apache systems that don't provide the webmaster with full access are often configured to let the webmaster make changes through the .htaccess file in the project directory. Unlike changes made directly in the server or virtual host configuration, changes made to .htaccess are active the very next time the server tries to access something underneath the specified directory. To enable this feature, you need to set the AllowOverride directive.

By default, Apache checks for a file named .htaccess. (The name of this file is configurable with the AccessFileName directive, but I have never seen a system that used a different name.)

Before serving the files in a directory, Apache will first look for .htaccess if AllowOverride is enabled. This check takes time and is wasteful if you will never have any .htaccess files. To make matters worse, Apache also checks all of the parent directories for .htaccess files. Depending on how deep in the filesystem the web server's files (DocumentRoot) reside, this could mean several extra lookups for every request.

Typically you don't access files outside of DocumentRoot, so for the directories above it, and for directories that don't need different options, you can give yourself a performance boost by setting AllowOverride to None. One configuration I often use enables overrides for the DocumentRoot directory of each virtual host - the root directory almost always has a .htaccess file - but then disables them for subdirectories.
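A sketch of that configuration might look like this (the paths are illustrative):

<Directory "/srv/www/vhosts/example.com">
    AllowOverride All
</Directory>

<Directory "/srv/www/vhosts/example.com/*">
    AllowOverride None
</Directory>

Because the wildcard section matches the subdirectories and is applied after the shorter path, the .htaccess lookup happens only in the document root itself.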

Faster Pages

Increasing web server performance is only part of the battle. If your pages are improperly designed, making changes to the server configuration could have little or no effect. For example, if you have lots of high-resolution, 5MB images on your site, you can do little to configure your web server to compensate effectively for the performance loss you'll suffer serving up these big files. If each page load pulls in multiple CSS or JavaScript files, you should address this problem before you start tweaking the web server. By combining files to reduce the number of HTTP requests, you can often decrease the overall load time and give the server the chance to service other visitors.
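A minimal sketch with hypothetical file names: concatenate the stylesheets once at deployment time and reference only the combined file in your pages.

cat reset.css layout.css theme.css > combined.css

The same approach works for JavaScript files, provided you preserve the order in which the scripts expect to be loaded.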

Another technique that speeds up your pages (by decreasing the number of file accesses) is to include the CSS or JavaScript directly in the HTML file (or in the rendered HTML). Note that you gain nothing by rendering the HTML with something like PHP that loads the CSS or JavaScript from separate files - the server still performs multiple file accesses.

Small changes, like removing unnecessary HTML tags or CSS class definitions, decrease file size, allowing files to load quicker so the page displays faster. The difference might not be visible to individual visitors, but the server benefits: less time is spent processing each request, and less data is transferred to serve the same number of pages.

Caching

With the vmstat command, you can monitor how much time is spent waiting for I/O: the wa column shows the percentage of time spent waiting (see Figure 2).

Figure 2: The vmstat output shows (among other things) the percentage of time spent waiting for I/O.
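To watch this over time rather than as a single snapshot, give vmstat an interval and a count - here, one reading every five seconds, ten times:

vmstat 5 10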

If this waiting time remains low while your web server is under heavy load, you don't necessarily have a problem. In addition, a tool like awstats, which displays access statistics, can give you an idea of the average number of files read for each page rendered.

To increase the likelihood that files will be cached, use page expiration. If you use the default, files might be retrieved even when they don't really need to be. However, by increasing the expiry on your pages, you ensure that the page can be cached for a longer period (either by the browser or a proxy).

To do this, you need mod_expires, which is included with many Linux distributions. Of the different directives, ExpiresActive can be included in the server configuration, virtual hosts, directory blocks, and .htaccess files; it then applies only to that particular part of your site. For example, you could enable it for the entire site and then turn it off for a specific directory by including it in a .htaccess file. The ExpiresDefault directive specifies the default expiration, and you can specify dates on the basis of when the file was last modified or accessed. Because you can specify the date in a human-readable form,

ExpiresDefault "modification plus 1 week"

would set the expiration date to one week after the file was last modified.

Taking this a step further, the ExpiresByType directive lets you specify an expiration on the basis of MIME type. For example, images typically have a longer expiry than HTML files.
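A short sketch of what this might look like (the times are illustrative, not recommendations):

ExpiresActive On
ExpiresDefault "access plus 1 day"
ExpiresByType image/png "access plus 1 month"
ExpiresByType text/css "access plus 1 week"

Here, PNG images and stylesheets, which change rarely, are allowed to stay cached longer than the pages themselves.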

mod_rewrite

One of the most powerful and complicated aspects of Apache is the mod_rewrite module. As its name implies, mod_rewrite provides the ability to rewrite URIs.

Although this feature does not necessarily represent a significant performance problem, system resources are expended for each rule that is tested and each URI that is rewritten. The key is to avoid processing any URI more often than necessary and to keep rules away from URIs they don't apply to.

One way of addressing this problem is to use the [L] flag. This flag tells Apache that it has reached the last rule it should apply to the given URI; that is, no further rules are processed.

Unfortunately, this is not as easy as it seems. If the rewrite rule does in fact cause a redirection, the new URI is itself processed, and everything starts over again. In many cases, you can avoid this with rewrite conditions (RewriteCond), so that a URI that already matches the desired result is not processed further, as in the sketch below.
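A minimal sketch of the classic front-controller pattern, assuming the rules live in the virtual host configuration and that index.php and the q parameter are placeholders for your own application:

RewriteEngine On
# Do not rewrite requests that already point at the front controller
RewriteCond %{REQUEST_URI} !^/index\.php
RewriteRule ^/(.*)$ /index.php?q=$1 [L]

Without the condition, the rewritten URI would match the pattern again on the next pass and loop.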

A related option is the [S], or skip, flag, which skips over a given number of rewrite rules when the current rule matches. For example, if the URI does not refer to a regular file (e.g., it is a directory), you can tell Apache to skip over rules that apply only to files. In other words, you don't avoid the rewrite rules completely, but you reduce the number you need to process, as the following sketch shows.
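In this minimal sketch (the handler target is hypothetical), the first rule matches only when the request does not map to a regular file, and [S=1] then skips the one rule that follows:

# When the request is not a regular file, skip the next rule
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule .? - [S=1]

# Reached only for regular files
RewriteRule \.html$ /handler.php [L]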

Comments

In your development environment, you might want to have your files loaded with comments, but on live servers comment lines are just extra bytes to transfer. Comments that are structured properly can be removed easily with shell scripts, which means you can comment your files for the testing phase and then strip the comments before you move the files to the live server.
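A minimal sketch, assuming you restrict yourself to full-line // comments in your JavaScript files (the file names are hypothetical):

sed '/^[[:space:]]*\/\//d' app.dev.js > app.js

The same idea works for other comment styles; just adapt the pattern and make sure it cannot match anything inside the actual code.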

Conclusion

The Pareto principle (80:20 rule) applies to web server performance tuning. Implementing many of the concepts described in this article can increase your performance relatively quickly; however, to squeeze out the last bit of advantage, you probably need to spend a lot more time. Depending on the kind of server you run, your mileage will vary.