Scaling your website with the Perlbal web server

The Juggler

Let the nimble Perlbal web server keep your traffic in balance.

By Frank Wiles

davidmonjou, Fotolia

Are you plagued by too many Apache children eating up your RAM? Do you have a bunch of slow clients accessing your site? Do you need to load balance between several servers? Are you trying to separately route static versus dynamic content? Are you operating without a budget for new hardware? If you answered "Yes" to any of these questions, you probably need Perlbal. Perlbal [1] is a single-threaded, reverse-proxy load balancer and web server. A plugin interface lets you extend Perlbal in interesting ways. Although Perlbal itself is single threaded, internally, it uses asynchronous I/O to achieve its phenomenal performance.

Perlbal was written by Brad Fitzpatrick (of LiveJournal and Memcached fame) to solve several of the problems that are familiar to anyone who has maintained a large website. At some point during the life of a large website, you will run into one or more of these issues:

The first four issues are easy to diagnose, but the last issue, which some of us affectionately call the "Spoon Feeding Problem," is harder to spot. You are probably suffering from it a little now, but it probably isn't noticeable.

The spoon feeding problem arises when the browsers visiting your site cannot download your pages as fast as your server can dish them out. Apache ends up "spoon feeding" the content a few packets at a time to the browser. During this time, the Apache child assigned to the task cannot serve up content for anyone else. You're probably saying to yourself, "Yes, but MY visitors are all on broadband so I don't have this problem."

Think again. Broadband users can still suffer from this problem, especially when you factor in pages that include more than a few images, JavaScript files, or CSS files. By default, browsers only download a certain number of included elements (typically four) at any one time, which can easily cause an otherwise speedy client to behave as a slow client. Perlbal addresses the spoon feeding problem and many other performance challenges that could be slowing your website.
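To get a feel for the numbers, here is a rough back-of-the-envelope sketch in Python. The page size and link speeds are made-up values chosen only for illustration, and the model assumes a proxy that buffers the whole response and ignores latency and TCP behavior.

```python
def transfer_seconds(response_bytes, bytes_per_second):
    """Time needed to push a response through a link of the given speed."""
    return response_bytes / bytes_per_second

# Hypothetical numbers, not measurements.
PAGE_BYTES = 200_000        # one page plus its assets
CLIENT_SPEED = 50_000       # slow client: 50 KB/s
LAN_SPEED = 50_000_000      # proxy on the same LAN: 50 MB/s

# Without a buffering proxy, the Apache child is tied up for the whole
# client download.
direct = transfer_seconds(PAGE_BYTES, CLIENT_SPEED)

# With a buffering proxy in front, the child only has to hand the response
# to the proxy, which then feeds the slow client on its own.
proxied = transfer_seconds(PAGE_BYTES, LAN_SPEED)

print(f"child busy without proxy: {direct:.3f} s")
print(f"child busy with proxy:    {proxied:.3f} s")
```

Under these assumptions, the child is occupied for seconds in the first case and for milliseconds in the second, which is the whole point of putting a lightweight proxy in front of heavy Apache processes.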

Installing and Configuring

You have two options for installing Perlbal: by hand or via the CPAN shell [2]. To install Perlbal from the CPAN shell, su to root (or use sudo) and run the command cpan. Enter the following two commands:

install IO::AIO
install Perlbal

This sequence will download any other prerequisite CPAN modules along with Perlbal itself. If you haven't used the CPAN shell before, note that the installation process takes you through several configuration steps; in most cases, the default options are sufficient.

Your other option is to install Perlbal by hand. To do so, download the latest tar.gz files for Perlbal, along with the CPAN modules Danga::Socket and IO::AIO, which Perlbal uses to implement asynchronous I/O. You then need to extract the archives with tar -xzf and cd to the directories created. Then it's a simple matter of running the following commands in each directory:

# perl Makefile.PL
# make
# make install

If you need Perlbal to support SSL, download and install the IO::Socket::SSL CPAN module. Perlbal, by default, is configured via the file /etc/perlbal/perlbal.conf. However, you can specify an alternative file with the --config=/path/to/file command-line option.

One of the more interesting features of Perlbal is that it provides a telnet management interface to itself. By connecting to the management port on the local server, you can alter Perlbal's configuration on the fly at run time. This option is useful for doing seamless migrations or to remove a web server from the load balancing pool for maintenance.

The management interface is typically configured in perlbal.conf as follows:

CREATE SERVICE mgmt
   SET role = management
   SET listen =
ENABLE mgmt

Once you have this in your configuration file, you can start Perlbal with:

perlbal --daemon

You can access the management interface via Telnet, so you just need to connect to the proper port with:

# telnet localhost 60000

The use of the loopback address is intentional; this restriction ensures that the management console is accessible only when connecting from the server. No remote access is allowed.

Note that Perlbal does not give you a prompt when you connect and expects you to just start sending commands. The lack of a prompt has tripped up many new Perlbal users.

Perlbal as a Web Server

I'll start off with the simplest Perlbal setup. Assume you have an entirely static site and you want to avoid using Apache altogether.

For this scenario, you would configure Perlbal like this:

  SET role    = web_server
  SET listen  =
  SET docroot = /home/httpd/html
  SET dirindexing    = 1
  SET persist_client = 1

The first two parameters say that you want to run a service that should act like a simple web server and you want it to bind to port 80 on all of the server's configured IP addresses. You may also restrict this service to a specific IP address or a subset of the available addresses.

The docroot directive sets the base location where Perlbal will look for your content. The SET dirindexing directive tells it to perform "directory indexing," so you can view a list of all files in the directory if no index.html exists; the last option turns on keep-alive support.
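The lookup that docroot and dirindexing describe can be sketched in a few lines of Python. This is only an illustration of the idea, not Perlbal's code; the function name and return values are hypothetical.

```python
import os

def resolve(docroot, url_path, dirindexing=True):
    """Map a URL path to ('file', path), ('listing', names) or ('not_found', None)."""
    root = os.path.realpath(docroot)
    target = os.path.realpath(os.path.join(root, url_path.lstrip("/")))
    # Refuse paths that escape the document root (for example via "..").
    if target != root and not target.startswith(root + os.sep):
        return ("not_found", None)
    if os.path.isfile(target):
        return ("file", target)
    if os.path.isdir(target):
        index = os.path.join(target, "index.html")
        if os.path.isfile(index):
            return ("file", index)          # index.html wins over a listing
        if dirindexing:
            return ("listing", sorted(os.listdir(target)))
    return ("not_found", None)
```

With dirindexing switched off, a directory without an index.html simply yields nothing to serve.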

Virtual Hosts

Often you need to host several domains on a single IP address. To illustrate this configuration, I'll set up two different domains as simple web servers. That configuration is shown in Listing 1.

Listing 1: Multiple Domains on One Address
# Include the perlbal virtualhost plugin
LOAD vhosts

CREATE SERVICE foo_service
   SET role           = web_server
   SET docroot        = /home/
   SET dirindexing    = 1
ENABLE foo_service

CREATE SERVICE bar_service
   SET role           = web_server
   SET docroot        = /home/
   SET dirindexing    = 1
ENABLE bar_service

CREATE SERVICE virtualhosts
   SET listen         =
   SET role           = selector
   SET plugins        = vhosts
   SET persist_client = 1

   VHOST *    = foo_service
   VHOST *    = bar_service
ENABLE virtualhosts

The configuration in Listing 1 first creates a simple web server service for each domain, omitting the listen directives and persist_client options. These options are handled by the third service, the cleverly named virtualhosts service.

The virtualhosts service is configured as a selector, which is Perlbal's term for a virtual service that maps one service onto another. For the purposes of visualization, you can think of the virtualhosts selector service as proxying the connections to the individual foo_service and bar_service, depending on the requested hostname.
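Conceptually, a selector is little more than a lookup from the requested hostname to a service name. The following Python sketch shows that idea with shell-style wildcards; the hostnames are hypothetical, and the matching rule is a simplification rather than Perlbal's actual algorithm.

```python
from fnmatch import fnmatch

# Hypothetical hostnames; the service names come from Listing 1.
VHOSTS = [
    ("*.foo.example", "foo_service"),
    ("*.bar.example", "bar_service"),
]

def select_service(host_header, vhosts=VHOSTS):
    """Return the service name for a Host header, or None if nothing matches."""
    host = host_header.split(":")[0].lower()   # drop an optional :port
    for pattern, service in vhosts:
        if fnmatch(host, pattern):
            return service
    return None

print(select_service("www.foo.example"))
```

A request for any other hostname falls through and returns None, which is where a real setup would want a catch-all entry.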

If you wanted to host a few websites on one IP address and a few on another, you would just need to set up another selector service to listen on the alternative IP address and map the domain names to their respective web_server services.

Load Balancing with Perlbal

Although you might find some utility in mimicking simple web servers, load balancing is where Perlbal really shines. I'll configure a slightly larger setup with three servers. One server is where Perlbal will run; the other two, web1 and web2, are the servers that hand out files.

If you assume that these hostnames map to fixed IP addresses, you would create the configuration shown in Listing 2 on the Perlbal server.

Listing 2: Load Balancing
CREATE POOL apache_pool
  POOL apache_pool ADD
  POOL apache_pool ADD

CREATE SERVICE balancer
  SET listen          =
  SET role            =    reverse_proxy
  SET pool            =    apache_pool
  SET persist_client  =    on
  SET persist_backend =    on
  SET verify_backend  =    on
ENABLE balancer

For this setup to work properly, you will want to point the DNS for your domain to the Perlbal server's IP address. This configuration sets up a reverse_proxy service, which will proxy and load balance requests across both back-end servers.

The verify_backend option instructs Perlbal to ensure it is talking to an actual web server back-end and not the server's TCP stack. If it determines it is stuck in, say, web2's listen queue, it will resend the request to web1.
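The idea behind verify_backend is "probe first, then commit." A minimal Python sketch of that idea follows; is_responsive is a hypothetical stand-in for whatever probe the proxy uses, and nothing here is taken from Perlbal's source.

```python
def pick_verified_backend(backends, is_responsive):
    """Return the first back end that answers the probe, or None if none do."""
    for backend in backends:
        if is_responsive(backend):
            return backend
    return None

# Pretend web2 accepted the TCP connection but never answers.
stuck = {"web2"}
chosen = pick_verified_backend(["web2", "web1"], lambda b: b not in stuck)
print(chosen)
```

The request is only handed to a server that has proven it is really listening, so a busy listen queue costs a quick probe rather than a stalled client.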

The persist_backend option is useful if Perlbal will be the only client connecting to your back-end pool nodes. It will then use HTTP keep-alives to keep an open connection between Perlbal and the web nodes. Perlbal can also read a list of nodes for a particular pool from a separate text file.

The file is checked every few seconds and any changes are reflected. This feature is configured with:

CREATE POOL foo_nodes_from_file
     SET nodefile = /home/
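The reload-on-change behavior can be pictured with the following Python sketch. It assumes a file with one address per line and "#" comments, which is my assumption about the format; the class name and the polling approach are likewise only illustrative.

```python
import os

class NodeFilePool:
    """Reload a list of back-end nodes whenever the node file changes."""

    def __init__(self, path):
        self.path = path
        self.mtime = None
        self.nodes = []

    def refresh(self):
        """Re-read the file if its modification time changed; return the nodes."""
        mtime = os.stat(self.path).st_mtime_ns
        if mtime != self.mtime:
            with open(self.path) as fh:
                # Strip comments and surrounding whitespace, skip empty lines.
                lines = (line.split("#", 1)[0].strip() for line in fh)
                self.nodes = [line for line in lines if line]
            self.mtime = mtime
        return self.nodes
```

Calling refresh() every few seconds gives the same effect as described above: edit the file, and the pool follows without a restart.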

Management Interface

Suppose one of the back-end servers is due for some maintenance or has suffered a hardware failure. If you were to Telnet to the management interface and run the command

pool apache_pool REMOVE

that server would then be removed from the pool and all traffic would be routed to the remaining server until you add it back:

pool apache_pool ADD
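What REMOVE and ADD do to the pool can be modeled with a small Python class. This is a toy model under my own assumptions: the node names are placeholders, and picking a node at random is an assumption for the sketch, not a statement about how Perlbal chooses.

```python
import random

class Pool:
    """A named set of back-end nodes that can change while requests flow."""

    def __init__(self, name):
        self.name = name
        self.nodes = []

    def add(self, node):
        if node not in self.nodes:
            self.nodes.append(node)

    def remove(self, node):
        if node in self.nodes:
            self.nodes.remove(node)

    def pick(self):
        """Choose a node for the next request (random choice is an assumption)."""
        if not self.nodes:
            raise RuntimeError(f"pool {self.name} is empty")
        return random.choice(self.nodes)

pool = Pool("apache_pool")
pool.add("web1")
pool.add("web2")
pool.remove("web2")      # take web2 out for maintenance
print(pool.pick())       # only web1 can be chosen now
pool.add("web2")         # maintenance done
```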

Other useful commands for the management interface are shown in Table 1.

Also, you can execute any of the configuration commands I've used in other examples on the command-line interface. For example, you could use CREATE POOL to set up a new pool of web servers called new_apaches and then use:

SET balancer pool = new_apaches

Perlbal will switch to using the new back-ends without skipping a beat.
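Because the management port speaks plain text lines, the same commands can be scripted instead of typed. The Python sketch below assumes the management service listens on port 60000 of the loopback interface, as in the Telnet example; the function name is hypothetical, and the code simply returns whatever text arrives before the timeout, because I make no assumption about the exact reply format.

```python
import socket

def send_mgmt_command(command, host="127.0.0.1", port=60000, timeout=2.0):
    """Send one command line and return the text received before the timeout."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        # There is no prompt to wait for, so the command goes out immediately.
        conn.sendall(command.encode("ascii") + b"\r\n")
        chunks = []
        try:
            while True:
                data = conn.recv(4096)
                if not data:            # server closed the connection
                    break
                chunks.append(data)
        except socket.timeout:          # server went quiet; stop reading
            pass
    return b"".join(chunks).decode("ascii", errors="replace")
```

With a running Perlbal, a call such as send_mgmt_command("SET balancer pool = new_apaches") would switch pools from a cron job or deployment script.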

Perlbal Plugins

Perlbal has a plugin system you can use to write your own custom logic. Check the Perlbal source for documentation and examples of writing your own plugins. For example, you could write a plugin that redirects all static content to a particular pool of servers on the basis of the URL.

A friend of mine, Jacob Kaplan-Moss, wrote a plugin for his employer that checks for the existence of a session cookie that would indicate whether the requester was logged into a local news site. If the user isn't logged in, Perlbal directly checks for the existence of a cached copy of the page from memcached and returns it to the client. This essentially removes the web server from the picture for any cached pages requested by users not logged in, which dramatically reduces server load.
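The decision that plugin makes can be sketched in Python as follows. The real plugin is not shown in this article, so everything here is illustrative: the cookie name is hypothetical, a plain dict stands in for memcached, and fetch_from_backend represents the normal proxy path.

```python
def handle_request(cookies, url, cache, fetch_from_backend):
    """Serve anonymous users from cache when possible; otherwise use the back end."""
    logged_in = "sessionid" in cookies      # hypothetical cookie name
    if not logged_in:
        cached = cache.get(url)
        if cached is not None:
            return cached                   # the web server never sees this request
    return fetch_from_backend(url)

cache = {"/news": "cached front page"}
backend_hits = []

def fetch_from_backend(url):
    backend_hits.append(url)
    return "fresh page for " + url

print(handle_request({}, "/news", cache, fetch_from_backend))
print(handle_request({"sessionid": "abc"}, "/news", cache, fetch_from_backend))
```

Only the second call, from a logged-in user, reaches the back end.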

The AccessControl plugin lets you allow or deny requests on the basis of IP address or netmask. To use this plugin, just add the line LOAD AccessControl to the top of your configuration file and enter the following:


This would deny access by default but allow anyone in the specified network. Or if you just need to block a few bad IP addresses, you could reverse this:


which would allow everyone by default, except for the bozo at the listed address.
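The two policies boil down to a default plus a list of exceptions. Here is a Python sketch of that logic using the standard ipaddress module; the addresses come from the documentation-only ranges and are hypothetical, and "first matching rule wins" is my assumption rather than a description of the plugin's internals.

```python
import ipaddress

def is_allowed(client_ip, rules, default_allow):
    """Apply the first matching (action, network) rule, else the default policy."""
    addr = ipaddress.ip_address(client_ip)
    for action, network in rules:
        if addr in ipaddress.ip_network(network):
            return action == "allow"
    return default_allow

# Deny by default, allow one network (hypothetical addresses).
office_only = [("allow", "192.0.2.0/24")]
print(is_allowed("192.0.2.55", office_only, default_allow=False))
print(is_allowed("198.51.100.9", office_only, default_allow=False))

# Allow by default, block one bad address.
block_one = [("deny", "203.0.113.7/32")]
print(is_allowed("203.0.113.7", block_one, default_allow=True))
```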

The other plugin you might find useful is NotModified. This plugin will return a 304 to the client if the client sends an "If-Modified-Since" header. By returning the 304, Perlbal is telling the browser that, if you have a copy in your cache, it's up to date.

This plugin is useful if you have a page in which your content really doesn't change - for example, an image or video archive. The use of this plugin removes the need for Perlbal to even ask a back-end node about the last modification time.
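The behavior described here fits in a few lines. This Python sketch follows the description above, not the plugin's source, and the function name is hypothetical. Note that it never compares dates, which is exactly why the approach only suits content that does not change.

```python
def not_modified_response(request_headers):
    """Return 304 if the client sent If-Modified-Since, else None to pass it on."""
    names = {name.lower() for name in request_headers}   # header names are case-insensitive
    if "if-modified-since" in names:
        return 304
    return None

print(not_modified_response({"If-Modified-Since": "Sat, 01 Jan 2000 00:00:00 GMT"}))
print(not_modified_response({"Accept": "image/png"}))
```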


I hope these examples reveal some ways Perlbal might help you in your quest for better and easier web server performance, load balancing, and reverse proxy service.

[1] Perlbal:
[2] CPAN: