LJ Archive

AlphaMail Is Scalable and Accessible Web Mail

Tony Kay

Issue #160, August 2007

AlphaMail takes a unique approach to providing a Web-based IMAP client.

AlphaMail is a high-performance, feature-rich, open-source Web mail system created at the University of Oregon. The interface includes message snippets in indexes, UTF8 composition and numerous viewers for attachments (such as image icon preview and file listings from tarballs). It also tries to strike a balance between desirable features and too much interface noise. It was created to address several problems that exist with other open-source and commercial Web mail systems.

Performance

The first concern AlphaMail addresses is performance. Almost all Web mail systems (such as Horde's IMP Web mail Client and SquirrelMail) use IMAP from within the Web server, which is incapable of persisting an IMAP session.

The IMAP protocol is designed to optimize access through persistent access, so this is an inherent and recognized problem. The problem is usually mitigated with an IMAP proxy that maintains a persistent connection. The problems with this solution are multifaceted.

One problem is that the code in the Web mail client itself cannot depend on the state of the IMAP connection and must repeat commands as if each mouse click were a new IMAP session. This is a problem, because the sequence of required events for a new session in the IMAP protocol include authenticating and selecting the desired folder. The benchmarks of several IMAP servers indicate that the repetition of the folder selection command, even if the folder is already authenticated and selected (that is, through a proxy), can cause significant extra server load.

These inefficiencies could be addressed through improvements in the IMAP server and proxy algorithms, but another problem is intractable: a proxy cannot improve the protocol. The fact that the Web mail client is using IMAP forces it to behave as a complete standalone client. If the developers want to add a complex feature, such as conversation views (à la Google mail), which requires complex message cross-referencing across several folders, the protocol itself becomes a major impediment.

AlphaMail solves these problems by including a middleware layer that uses a simplified and extensible protocol for the Web application and is responsible for optimizing access to the mail servers. The protocol supports highly specialized commands that allow the Web code to ask for the information needed for a page directly, without having to make any assumptions about the state of the IMAP connection.

This has the additional advantage that the IMAP protocol handling in the middleware layer can be written in a high-performance language (in this case C++), can cache results and can optimize the interaction. The middleware program is known as the imap_webcache, because it caches both data and network connections for the mail interaction.

Accessibility

The next concern was accessibility. Information systems in higher education, government and many other environments must support access for everyone. AlphaMail still is actively being tuned for access by the disabled, but it is already optimized for other access concerns. For example, site security policies might require or recommend that members of the community disable browser features that regularly appear in US-CERT advisories, such as JavaScript. The only Web e-mail I could find that did not require JavaScript for even basic functionality was SquirrelMail, and it was deemed too risky from a performance standpoint.

AlphaMail includes JavaScript enhancements, but those enhancements have traditional CGI alternatives that become active if JavaScript is disabled on the browser. All critical functions of the system will work with any browser, independent of capabilities and settings.

Installation

AlphaMail supports GNU autoconf and has been built cleanly on many Linux variants, FreeBSD, Solaris and Darwin (OS X). The easiest platform to install is Fedora Core 5 via yum(1), as described on the AlphaMail home page.

I highly recommend using Fedora Core 5 on a test machine to avoid the headaches of dependency resolution for your initial trial run. The actual build is pretty much what you'd expect, but solving runtime dependencies can be challenging.

If you decide on a source build, first you need to install the Boost C++ libraries (see Resources).

Installing Boost

You need the Boost Jam utility (usually named bjam) as well as version 1.33 or better of the Boost C++ libraries. Jam is sort of a combination between GNU autoconf and make. Follow the instructions from the Boost Web site for details, but essentially, extract the files and run:

# bjam install

In rare circumstances, you may want to pass options (such as an installation prefix). See the Getting Started guide on the Boost Web site if you have special needs.

Building AlphaMail from Source

The build is very much what you would expect:

# ./configure
# make
# make install

The configure script checks your system for dependencies and tells you what is missing or out of date. You should be able to complete all three steps as long as you have Boost installed, even if Perl dependencies are not met.

Be aware that some versions of g++ have a bug that will cause the compiler to go into an infinite loop while building imap_webcache. Two of the files in the source (IMAPFolder_rules.cc and RFC2822.cc) can trip this bug, but even with a good compiler, these files take a large amount of time and space to build, so expect the compile to run for a few minutes.

On some systems, there can be problems with the location of files. For example, on Red Hat Enterprise Linux systems, Kerberos dependencies cause OpenSSL code to fail to compile, which can be corrected by passing a value for CXXFLAGS:

# CXXFLAGS="-I/usr/kerberos/include" ./configure

As with any other GNU autoconf system, check the config.log if the configuration script fails to complete.

Installing Runtime Dependencies

The biggest difficulty on most distributions is not building AlphaMail, but rather meeting all of the prerequisites. Many distributions come with an older version of mod_perl, libapreq and other Perl modules.

Make sure that you have libapreq2 and mod_perl >= 2.0 installed before messing with the other Perl dependencies, because some of them rely on one or both. Once this is done, you should be able to install the remaining Perl dependencies with your package manager or cpan(1).

You need to use cpan(1) if the packaged versions of a module do not exist or are too old. See the sidebar on avoiding packaging conflicts with your distribution's package manager.

Configuration

The build instructs you to create template configuration files with the alphamail_genconfig utility. This script prompts you for all of the necessary configuration options for a basic installation and creates the necessary files in a location you provide.

You need a special user for a sandbox and knowledge of what user your Web server runs as before starting configuration. I recommend creating a user named sandbox for the former. Logins for this user should not be enabled.

The configuration will ask for an installation prefix, which is whatever you passed to configure. This is usually /usr/local, and the script will verify correctness before continuing.

You will be asked to provide an IMAP prefix and separator for each IMAP server you want to access with AlphaMail. Some IMAP servers use slash (/) for the separator; others use dot (.). The prefix is a subfolder where users put all their other mail folders. For example, if users have shell access and their mail folders are stored in their home directories, it might be policy to put all of them in a directory named mail, in which case the prefix is probably mail.

It is important to note that some Web servers (such as Cyrus) use INBOX for the prefix and dot (.) for the separator. The following procedure can help you determine what to use. First, connect to the IMAP server from the command line, with:

# openssl s_client -connect imap.example.com:993 # For SSL

or:

# telnet imap.example.com 143  # for no SSL

These commands connect you to the IMAP server and allow you to enter protocol commands. Type the following (the numbers are part of the commands):

1 login username password
2 list "" "%"
3 logout

The username and password, of course, should be real user credentials for a typical IMAP account. The responses to the second command should look like this:

* LIST (\HasNoChildren) "." "INBOX.Spam"
* LIST (\HasNoChildren) "." "INBOX.Trash"

which indicates that . is the separator and makes it pretty obvious that INBOX is a common prefix (in this case all entries start with INBOX.).

The prefix parameter is primarily an interface optimization: the interface removes the prefix when displaying most folder names in order to make things more compact. You can hand-edit any of the parameters in the resulting alphamail_config file, which is a commented text file. The entry for defining a pair of typical IMAP servers that serve two mail exchanges looks like this:

imap_servers: example.com=imap.example.com:993[INBOX.], 
↪example.net=imap.example.net:143[/]

The above setting indicates that users should be able to select their mail domain on login (example.com or example.net), and associates these with a corresponding IMAP server, port, prefix and IMAP path separator.

The separator in the brackets is always required, but the prefix is not. The notation [/] means no prefix, with slash as the separator. The IMAP connections will be insecure if you use anything but the SSL alternate port 993.

Attachment viewers and other external programs run in a sandbox that uses a chroot jail, user ID protections and other filesystem restrictions to ensure that a bug in a viewer cannot compromise anything more than the file the user is trying to view, which by definition would be the file containing the exploit. This is where you will use the extra user you created earlier.

The sandbox utility is installed in /usr/local/libexec/sandbox, by default, and is a setuid program. It is important that the permissions of this executable allow execution by the Web server, but it is a security hazard to allow any other user access to the utility. I recommend that AlphaMail be run on a standalone system that serves only Web mail and nothing else, with no shell access for users.

The configuration also asks you to configure the large file-sharing system. This option allows users to upload files to the AlphaMail system, so that others can download them later. Large file sharing is useful when someone needs to send a file that is larger than is allowed or recommended as part of an e-mail message. File sharing has several safeguards to prevent abuse, including terms-of-use agreements, size limits, password protection, encryption, download limits and time-based expirations. Choosing a zero size for the size limit in file sharing disables the feature.

The final step is to edit the Apache configuration. Make sure that mod_perl2 and libapreq2 are loaded with directives, such as:

LoadModule apreq_module modules/mod_apreq2.so
LoadModule perl_module modules/mod_perl.so

And, include the generated alphamail.conf Apache configuration file. For example:

Include /usr/local/etc/alphamail/apache/alphamail.conf

Running AlphaMail

Apache and imap_webcache must be running for AlphaMail to work. Startup order does not matter. A sample Red Hat init script for the Web cache is included and will be installed in /usr/local/share/alphamail/util/init.d.

A garbage collection script must be run periodically from cron. AlphaMail writes numerous files as the mail system operates, most of which are decoded MIME messages and attachments. These files cannot be cleaned reliably by the Web software, as there are no guarantees about user behavior. The script is called garbage_sweeper and is well documented in the Administration Guide.

AlphaMail is in production use at the University of Oregon. The performance and usability results have been very encouraging, and the former are available at the AlphaMail home page.

However, the system is still new, and there are some latent bugs that have yet to be solved. The imap_webcache itself is a rather complicated piece of software that may have occasional problems. As a result, I recommend running an included utility called the hang_detector (in /usr/local/share/alphamail/util by default). You must edit this script before using it, and it requires a valid IMAP user in order to work.

It runs a full query against the Web cache every 15 seconds and is capable of restarting the imap_webcache (via the included init script). It is also capable of sending mail to administrators if desired.

Common Problems

There are several places where you can run into difficulty on a new installation. The Administrator's Guide included with AlphaMail has many tips on how to solve these issues.

One of the most puzzling problems is produced by accessing the service via an incorrect domain name. The cookie that maintains the session is tied to the server domain that you specify during configuration and is stored in a parameter in the AlphaMail apache configuration:

PerlSetVar alphamailDomain server.example.com

If the user is able to access the login page with an unqualified hostname (such as server), the cookie will not be exchanged properly and login will fail without any error whatsoever. The login is actually succeeding, but the browser is not sending the cookie back because of the mismatched URL.

Fortunately, most users connect though a link or type an insecure HTTP URL. The former is never a problem, and a redirect does a nice job of correcting the latter:

<VirtualHost _default_:80>
   RedirectMatch ^/alphamail/?$ https://server.example.com/
   ↪alphamail/index.html
   RedirectMatch ^/alphamail/mail/?$ https://server.example.com/
   ↪alphamail/mail/index.html
</VirtualHost>

Other problems usually involve dependencies, configuration errors, incorrect permissions or missing auxiliary directories. The best indication of these are errors in alphamail_ui.log, which by default is created in /var/log.

SELinux also can be the source of problems. I have not run AlphaMail with SELinux, and unless you are willing to create your own security profile, you probably will need to turn it off or disable enforcing.

The Future of AlphaMail

AlphaMail development has been mostly concerned with performance and core functionality to this point, and I don't expect these two issues to become any less important in the future, as the product is directed at environments that have a large and diverse user community.

It is quite feature-rich, but certainly not all-inclusive. Limiting feature creep is an ongoing requirement, because the critical concern is easy and reliable access to essential functionality, not global coverage of mail client features.

Still, there are plans for significant improvements and additions, such as an internationalized interface, optional Ajax components that would improve client interaction and conversation views.

Tony Kay is the primary developer of AlphaMail and works for the University of Oregon in Eugene, Oregon. He can be reached at tkay@uoregon.edu or tony.kay@gmail.com.

LJ Archive