Discover the advantages and disadvantages of the four main server deployment strategies in practice today.
When I write this column, I try to stick to specific hacks or tips you can use to make life with Linux a little easier. Usually, I describe in pretty specific detail how to accomplish a particular task, including command-line and configuration file examples. This time, however, I take a step off this tried-and-true path of tech tips and instead talk about more-general, high-level concepts, strategies and, frankly, personal opinions about systems administration.
In this article, I discuss the current state of the art when it comes to deploying servers. Through the years, the ways that sysadmins have installed and configured servers have changed as they have looked for ways to make their jobs easier. Each change has brought improvements based on lessons learned from the past but also new flaws of its own. Here, I identify a few different generations of server deployment strategies and talk about what I feel are the best practices for sysadmins.
In the beginning, servers were configured completely by hand. When a Web server was needed, for instance, the sysadmin first would go through a Linux OS install one question at a time. When it came to partitioning, the sysadmin would labor over just how many partitions there should be and how much space /, /home, /var, /usr and /boot truly would need for this specific application. Once the OS was installed, the sysadmin either would download and install Apache packages via the distribution's package manager (if feeling lazy) or, more likely, would download the latest stable version of the source code and run through the ./configure; make; make install dance with custom compile-time options. Once all of the software was installed, the sysadmin would pore over every configuration file and tweak and tune each option to order.
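For anyone who never lived through it, that dance looked something like the following sketch (the version number and configure options here are purely illustrative, not a recommendation):

    # fetch and unpack the "latest stable" source by hand
    tar xzf httpd-2.2.x.tar.gz
    cd httpd-2.2.x

    # pick compile-time options to suit this one server
    ./configure --prefix=/usr/local/apache2 --enable-so --enable-rewrite
    make
    make install

Multiply that by every piece of software on the box, and you can see where the days and weeks went.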
Even the server's hostname was labored over, with a name chosen specifically to suit this server's particular personality (although it probably was named after some Greek or Roman god at some point in the sysadmin's career; sysadmins seem to love that naming scheme). In the end, you would have a very custom, highly optimized, tweaked and tuned server that was more like a pet to the sysadmin who created it than a machine. This server was truly a unique snowflake, and a year down the road, when you wanted a second server just like it, you might be able to get close if the original sysadmin was still there (and if he or she could remember everything done to the server during the past year); otherwise, the poor sysadmin who came next got to play detective. Worse, if that server ever died, you had to hope there were good backups, or there was no telling how long it would take to build a replacement.
The fact is, plenty of sysadmins still deploy servers this way today, and that's fine if you are responsible for only a handful of servers, or if your company can afford one administrator for every ten servers or so (the old recommendation from many years ago). For the most part, though, administrators have moved on from configuring servers completely by hand to one of the following three generations of server deployment automation.
Sysadmins started to realize that deploying servers completely by hand wasn't sustainable for large numbers of servers, especially if you needed multiple servers of a certain type. In response, administrators would go through all of the steps to lovingly craft a new server from scratch, and once that work was done, they would create a complete disk image of that server to lock in its fresh install state. When they needed another server just like it, they simply would apply that image to the new hardware using software like Ghost or even dd, then go in and change a few server-specific settings like the hostname and network information (perhaps with a script, if they wanted to automate things even further), and the server would be ready. Instead of days or weeks to deploy a server, they could have it up and running in a few hours. When sysadmins wanted a Web server, they would just locate the Web server image they had created before, apply it on top of bare metal, and in an hour or so in many cases, they would have a new functioning Web server.
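If you skipped commercial tools like Ghost and used plain dd, capturing and restoring an image might look something like this sketch (the device paths, image name and block size are just examples):

    # capture the "golden" server's disk to an image file on mounted storage
    dd if=/dev/sda of=/mnt/images/webserver-golden.img bs=4M

    # later, boot the new hardware from rescue media and lay the image down
    dd if=/mnt/images/webserver-golden.img of=/dev/sda bs=4M

    # then fix up the hostname, network settings and so on, by hand or by script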
The problem with images ultimately became the maintenance. Whenever you decided to upgrade the software on your servers, you were faced with a dilemma: either go through the painful steps to create a new image with the upgraded software or deploy the old image and run through any software upgrades by hand afterward. Either way, you still had to figure out what to do with existing servers in the field. Do you re-image them with an updated image and go through the hassle of backing up and restoring any unique data created after the image, or do you manually apply the changes you just made to your image? In addition, you might have two servers that were mostly the same but had enough differences to justify separate images, and eventually you found yourself maintaining an ever-growing library of large disk images even though they all might share 90% of the same software.
In response to all of the hassles of maintaining server images, some administrators realized they could bypass the pain of regenerating disk images, because they were installing the same base OS on all of their machines and applying any machine-specific changes only afterward. It was out of this realization that the next generation, the automated install with a post-install script, was born.
With an automated install (like kickstart for Red Hat-based distros or preseeding for Debian-based distros), administrators could create a configuration file containing all of the install-time options they used to pick by hand, feed it to the installer at boot time, go get some coffee, and when they returned, the server would have gone through the complete install without them. If administrators wanted a Web server, they would just select the installer configuration file for Web servers, which listed the set of distribution packages, including the Web server software, for the installer to select and install automatically.
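As a rough idea of what that configuration file looks like, here is a fragment of a kickstart file for a Web server (the mirror URL, password hash and package list are placeholders, not working values):

    install
    url --url=http://mirror.example.com/centos/os/x86_64/
    lang en_US.UTF-8
    keyboard us
    timezone America/Los_Angeles
    rootpw --iscrypted $1$examplehash
    clearpart --all --initlabel
    autopart

    %packages
    @base
    httpd
    %end

Every question the installer would have asked interactively is answered ahead of time in that one text file.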
Of course, an automated installer generally just left you with a base OS plus some extra packages that were installed but unconfigured. The real magic in these automated installers was in their post-install scripts. Simply stated, the post-install script was a shell script the installer would execute on the system after the base install was complete. The post-install script became an automation dream for sysadmins: if you could describe all of the commands and configuration file changes you wanted to make to a system inside a shell script, you could put it in a post-install script and have a completely automated server install.
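In kickstart, for instance, that shell script lives in a %post section. A trivial and purely illustrative example might enable Apache and drop in a site configuration (the file path and contents are assumptions for the sake of the example):

    %post
    # make sure Apache starts on boot
    /sbin/chkconfig httpd on

    # drop in a minimal site configuration
    cat > /etc/httpd/conf.d/site.conf <<'EOF'
    ServerName www.example.com
    EOF
    %end

Anything you could type at a shell prompt after an install, you could script here instead.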
The benefits of post-install scripts compared to images became apparent pretty quickly. Whenever you wanted to change the installation, all you had to do was change either the installer config file or your post-install script; there was no image to regenerate. These files were plain text and took up very little space on your disk. The files were easy to change, although unlike with images, when you changed a post-install script, you usually would need to run through a complete automated install to make sure you hadn't introduced a bug.
The fact is, automated installs customized with post-install scripts can be an effective way to automate server deployments, and it's a method that's still in wide use today. That said, it isn't without its own problems. The main problem with the post-install script method is that the automation stops the moment the server is first created. Any improvements you make to your Web server post-install script will help only new servers; any servers created before those improvements will be different. You will be faced with the dilemma of trying to back-port improvements to your existing servers or completely rebuilding them based on the new install scripts. Although it's easier just to try to apply any improvements to existing servers, you never will be confident that the server you set up six months ago and the server you set up today are identical. At one point, what I did to try to resolve this dilemma was put all of my configuration file changes into packages that I hosted on a local package repository and then installed on any relevant servers.
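A bare-bones RPM spec for that sort of configuration package might look something like the following sketch (the package name, file path and contents are all examples, and older rpmbuild versions may want a few more tags):

    Name:           example-httpd-config
    Version:        1.0
    Release:        1
    Summary:        Local Apache configuration for our Web servers
    License:        GPL
    BuildArch:      noarch
    Source0:        site.conf

    %description
    Ships our local Apache configuration file.

    %install
    mkdir -p %{buildroot}/etc/httpd/conf.d
    install -m 644 %{SOURCE0} %{buildroot}/etc/httpd/conf.d/site.conf

    %files
    /etc/httpd/conf.d/site.conf

Bumping the package version and pushing it to the local repository at least gave existing servers a path to pick up new configuration, but it was still a workaround rather than a cure.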
The final generation of server deployment attempts to address the main problem with post-install scripts: any changes to the configuration apply only to newly installed servers; therefore, new and old servers tend to fall out of sync with each other. To solve that problem, administrators now are turning to configuration management systems like Puppet and Chef. With centralized configuration management, any changes you need to make are made on the configuration management server and then deployed to all relevant servers, whether they have been around for a year or were just created today. As long as you make your changes through the central server, you can be confident your servers' configurations are identical.
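To give a flavor of what that looks like, here is a minimal sketch of a Puppet class for the Web server example (the class name, file paths and module source are assumptions for illustration, not a drop-in module):

    class webserver {
      # install Apache from the distribution's packages
      package { 'httpd':
        ensure => installed,
      }

      # manage the site configuration centrally; a change here rolls out
      # to every server that includes this class, old or new
      file { '/etc/httpd/conf.d/site.conf':
        ensure  => file,
        source  => 'puppet:///modules/webserver/site.conf',
        require => Package['httpd'],
        notify  => Service['httpd'],
      }

      # keep Apache running and enabled at boot
      service { 'httpd':
        ensure => running,
        enable => true,
      }
    }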
With centralized configuration management, automated installs and post-install scripts aren't thrown away; they just become more generic. Instead of all configuration being done via a post-install script, the automated install just installs the bare essentials of the operating system, and the post-install script does only what it needs to do so the configuration management software can check in. The configuration management system takes over from there and makes whatever changes are needed, including package installs and configuration file edits, to get the server ready for use. Because you can be more confident that a new server will match an old one, you end up being less fearful about any individual server going down. After all, why worry if you can re-create it in a few minutes?
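In practice, the generic post-install script might do little more than install the agent and point it at the central server, something like this sketch (the package command and server name are examples and will vary by distribution and Puppet version):

    %post
    # install the configuration management agent
    yum -y install puppet

    # point the agent at the central configuration management server
    cat >> /etc/puppet/puppet.conf <<'EOF'
    [agent]
    server = puppet.example.com
    EOF

    # have the agent check in at boot; configuration management takes over from here
    /sbin/chkconfig puppet on
    %end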
Hopefully this article has given you some ideas for ways to improve your server deployment strategies or otherwise has validated the server deployment decisions you've already made. Just be careful; this automation is powerful stuff, and if you aren't careful, you may go into work one day to find you've replaced yourself with a shell script.