The sys admin's daily grind: HAProxy

Balancing Stuntman


Charly's candidate today is the HAProxy load balancer, which not only distributes the load between servers, but also helps admins overcome their fears of lengthy configuration files.

By Charly Kühnast

Linux has no lack of free load balancers. Previously, I talked about Pen [1], which you can set up in a couple of minutes, and Pound [2], which is world famous. However, the high flyer in the balancer scene, HAProxy [3], has thus far led a fairly secluded life as an "Unknown Stuntman" [4]. The HAProxy balancer can handle just about any aspect of the proxy business - yet hardly anybody has heard of it. The program's author, Willy Tarreau, has successfully demonstrated his competence as the maintainer of the 2.4 kernel, but users do not seem to be in any hurry to take up his userspace offerings.

HAProxy uses a single thread - multithreading is for the timid. Tarreau's single-thread approach offers the decisive benefits of low overhead and high speed but requires meticulous care on the part of the developer. The single thread can go down because of a single error - a memory leak or a race condition, for example - and then the stuntman takes a tumble without a safety net.

Figure 1: HAProxy's load distribution statistics. Willy Tarreau's load balancer is still largely unknown.

The lack of fame could be due to HAProxy's configuration file, which is far bulkier than those of its competitors; it is definitely not due to poor performance. Tarreau's profound knowledge of the Linux kernel is evident in the proxy's development history, and polling provides a great example of this. The earliest versions of HAProxy still used select() for polling, which tends to be fairly lethargic when faced with a large number of open file descriptors. The function was replaced by calls to poll(), which in turn was ousted by epoll().
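To get a feel for what such a readiness loop looks like, here is a minimal Python sketch. It is not HAProxy code, which is written in C; the function name collect_messages and the socket pairs that stand in for client connections are my own assumptions for illustration. Python's selectors module picks the best mechanism the platform offers, which is epoll() on Linux, and falls back to poll() or select() elsewhere.

```python
import selectors
import socket


def collect_messages(payloads, timeout=1.0):
    """Send each payload over its own socket pair and read them all back
    from a single thread (hypothetical helper, for illustration only)."""
    # DefaultSelector uses epoll on Linux and falls back to kqueue,
    # poll, or select on other platforms.
    selector = selectors.DefaultSelector()
    sockets = []
    for index, payload in enumerate(payloads):
        sender, receiver = socket.socketpair()
        receiver.setblocking(False)
        # Ask to be told when the receiver is readable; remember its index.
        selector.register(receiver, selectors.EVENT_READ, data=index)
        sender.sendall(payload)
        sockets.extend((sender, receiver))

    received = {}
    while len(received) < len(payloads):
        events = selector.select(timeout)
        if not events:
            break  # nothing became readable in time; stop waiting
        for key, _mask in events:
            received[key.data] = key.fileobj.recv(4096)
            selector.unregister(key.fileobj)

    selector.close()
    for sock in sockets:
        sock.close()
    # Payloads that never arrived show up as None.
    return [received.get(i) for i in range(len(payloads))]


if __name__ == "__main__":
    print(collect_messages([b"hello from www1", b"hello from www2"]))
```

One thread watches all the descriptors and only touches the ones the kernel reports as ready, so the cost of waiting does not have to grow with every idle connection the way it does with select().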

Support for the brand new splice() system call, which moves data between two file descriptors inside the kernel without a detour through userspace, beams HAProxy into the league of two-figure gigabyte peak performers. Tarreau also proudly points to reference applications that continuously shovel between 2 and 3GB through his balancer.

I can find no real reason to be afraid of lengthy configuration files. Although HAProxy gives me plenty of settings to tweak, the basic setup for balancing between two web servers is pleasingly simple, as you can see from Listing 1. Admittedly, this is a simple task for the HAProxy balancer, yet stuntmen are used to hiding their light under a bushel to make the star of the show look better.

Listing 1: /etc/haproxy/haproxy.cfg
global
    maxconn 16000
    ulimit-n 65536

    user haproxy
    group haproxy

    daemon
    nbproc 1
    pidfile /var/run/haproxy.pid

listen http 0.0.0.0:80
    mode http
    option httplog
    balance roundrobin
    server www1 192.168.1.20:80 check
    server www2 192.168.1.21:80 check
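
The effect of "balance roundrobin" combined with "check" can be sketched in a few lines of Python. This is only a toy model under my own assumptions: the function pick_servers and the is_up callback are hypothetical names, and the sketch ignores server weights and everything else HAProxy's real scheduler does. It simply hands out the backends from Listing 1 in turn and skips any that fail the health check.

```python
def pick_servers(servers, requests, is_up=lambda server: True):
    """Return the backend chosen for each of `requests` consecutive requests.

    Hypothetical helper: plain round robin that skips servers for which
    `is_up` returns False. An entry is None if no backend is healthy.
    """
    chosen = []
    position = 0  # index of the server whose turn comes next
    for _ in range(requests):
        for offset in range(len(servers)):
            candidate = servers[(position + offset) % len(servers)]
            if is_up(candidate):
                chosen.append(candidate)
                position = (position + offset + 1) % len(servers)
                break
        else:
            chosen.append(None)  # every backend failed its check
    return chosen


if __name__ == "__main__":
    backends = ["192.168.1.20:80", "192.168.1.21:80"]
    # Both servers healthy: requests alternate between them.
    print(pick_servers(backends, 4))
    # www1 fails its check: everything goes to www2.
    print(pick_servers(backends, 3, is_up=lambda s: s != "192.168.1.20:80"))
```

With both servers up, the requests alternate; as soon as one fails its check, the survivor takes the whole load until its partner returns.
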
INFO
[1] Pen: http://siag.nu/pen/
[2] Pound: http://www.apsis.ch/pound/
[3] HAProxy: http://haproxy.1wt.eu
[4] Lee Majors, The Unknown Stuntman: http://www.youtube.com/watch?v=-3CXp54h4ew
THE AUTHOR

Charly Kühnast is a Unix operating system administrator at the Data Center in Moers, Germany. His tasks include firewall and DMZ security and availability. He divides his leisure time into hot, wet, and eastern sectors, where he enjoys cooking, freshwater aquariums, and learning Japanese, respectively.