Implementing a scalable cloud service with Scalr

Scale Up

Tools like Scalr and RightScale provide an easy path to a scalable cloud infrastructure.

By Dan Frost

When cloud computing came along, lots of us rushed to start up our Amazon instances and enjoy flexible hosting. Adding extra space wasn't too hard, and firing up extra instances meant we could try new features easily. But ... elastic it is not. Cloud computing is meant to be elastic, which means friction-free scaling up and down. And this scaling should be automatic. The first generation of cloud services, however, treats individual server instances as independent entities, meaning that it always takes some extra effort to expand or contract the infrastructure.

Recent tools like Scalr [1] and RightScale [2] provide a more elastic front end. Unlike the raw APIs and low-level tools provided with Amazon [3], Rackspace [4], and other vendors, Scalr and RightScale let you manage the entire cloud as a single entity independent of individual server instances. Cloud resources are seamlessly added and removed on the basis of demand. According to the Scalr project page, Scalr "... monitors all your servers for crashes, and replaces any that fail. It backs up your data at regular intervals and uses Amazon EBS for database storage. And to make sure you never pay more than you should, Scalr helps by decommissioning services when load subsides." In other words, Scalr lets you view your web infrastructure as a single, virtual system (a farm in Scalr-speak), provisioning resources behind the scenes so your infrastructure is never too big or too small. This capability replaces the home-spun scripts that some customers use for scaling the system, which lets the user concentrate on building better apps rather than managing the cloud. And Scalr doesn't just scale the web server - it even scales the back-end database, which, according to the vendor, is the hardest part about scaling a website.

Scalr is open source [5], but it also comes in a hosted version. If you want to try before you buy, grab a development account [6].

Creating a Farm

Once your account is authorized, you can start by building a farm. To create a farm, select Scalr's Server Farms menu item. Choose an EC2 location for the farm and then click Next. Scalr lets you choose predefined roles that indicate the various activities that will take place on your server farm. On the next screen, browse the available roles and add them to your farm. When you're done, simply save the settings and start up the farm. (Remember that Amazon is going to start charging you money every time an instance starts - even for a few minutes.)

Customizing Vanilla

An instance looks and behaves like a server, but it isn't a server. Instances are transient and should be as "lite" as possible; if the instance crashes, you don't care. The concept is to boot a vanilla virtual machine, then use a series of scripts to install software, grab code, and configure the cloud to run your app. (RightScale calls these scripts RightScripts; Scalr calls them scripts.)

The advantage of this approach is that vanilla instances are easy to come by and, if you need to make changes to your codebase or configuration, you can make those changes with scripts, rather than laboriously re-packaging an image and deploying it to all your instances.

To create a simple script on Scalr, select Scripts and then Add new. Fill in the script name as Tell me your name. Here, use the following script with your own email address:

#!/bin/bash
echo "I am a %role_name%" | mail -s "Checking in..." you@example.com

Click OK to create the script. After the script is saved, choose View All in the scripts menu and then select the Execute item from the Options drop-down on the right of the page. Then, on the execution page, just click Execute to run the script across your entire farm. While you wait for the script to execute, navigate to Logs and the Scripting log. After a few moments, the custom scripts will appear in the log and, if there were any problems sending the email, you'll be able to see them here.

Figure 1: Adding a role to the farm definition.

The next step is to modify the last script so it does something more interesting. Return to the script and modify it to:

#!/bin/bash
apt-get -y install apache2
echo "I am a %role_name%, and I have apache installed now" | mail -s "Installed apache" you@example.com

Now navigate back to the farm, open the Shared roles item in the tree, and choose Application servers. Beneath this, check app64. The tabs presented all relate to this application server role. Open the Scripting tab, then the OnHostUp item in the tree, and select the script you just edited (Figure 2). If it's still running, shut down your farm, start it up again, and wait for each instance to tell you what it is.

Figure 2: Select the script under the `OnHostUp` item in the tree.

The OnHostUp event does what it sounds like - when the host comes up, the script runs. If you have a standard configuration you like to apply, for example,

cat <<-EOF >/etc/apache2/conf.d/our.conf
# This is our standard config... don't want to do without it!
EOF

you can apply it at boot time inside a script.
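Because a script can also be executed by hand across a running farm, it can help to write the config file only when its content has actually changed. The following is a minimal sketch of that idea; the helper name install_config and the use of cmp for the comparison are illustrative assumptions, not part of Scalr.

```shell
#!/bin/bash
# Sketch: write a config file only when the new content differs from what
# is already on disk. install_config is a hypothetical helper name.
#
# Example use inside a boot script:
#   install_config /etc/apache2/conf.d/our.conf <<'EOF'
#   # This is our standard config... don't want to do without it!
#   EOF
install_config() {
    local target="$1"
    local tmp
    tmp="$(mktemp)" || return 1
    cat >"$tmp"                      # new content arrives on stdin

    if [ -f "$target" ] && cmp -s "$tmp" "$target"; then
        echo "unchanged"
    else
        mkdir -p "$(dirname "$target")"
        # Redirect rather than mv so the file gets default permissions
        # instead of the restrictive mode that mktemp assigns.
        cat "$tmp" >"$target"
        echo "updated"
    fi
    rm -f "$tmp"
}
```

The function prints "updated" or "unchanged", so a calling script could restart Apache only when the config really changed.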

Scripting a Web Server

To get a fully working website running on your web server using a script, take a site you've built, tar it up, and place it somewhere so you can reach it with the wget command. Then create a script:

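#!/bin/bash
apt-get -y install apache2
cd /tmp
wget -O your-code.tgz http://www.example.com/your-code.tgz
mkdir extract && cd extract
# The tarball was downloaded to /tmp, one level above extract/
tar xzf ../your-code.tgz
# Make sure the target directory exists before moving the code into place
mkdir -p /var/www/vhosts
mv * /var/www/vhosts/
service apache2 restart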

Attach the script to your farm's app role so that it runs at bootup, and restart the farm. Alternatively, you can securely store code with Amazon's S3 service and use the s3cmd command to grab it:

s3cmd get s3://your-bucket/your-code.tgz
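Whichever way the tarball is fetched, a truncated or corrupt download should not leave half a website in place. The sketch below unpacks into a scratch directory first and copies the files into the web root only if extraction succeeded. The helper name deploy_tarball is hypothetical, and the paths in the comments are just the ones used in this article.

```shell
#!/bin/bash
# Sketch: unpack a site tarball into a scratch directory, then copy it
# into the web root only if tar reported success.
# deploy_tarball is a hypothetical helper, not a Scalr feature.
deploy_tarball() {
    local tarball="$1"     # e.g., /tmp/your-code.tgz
    local webroot="$2"     # e.g., /var/www/vhosts
    local scratch
    scratch="$(mktemp -d)" || return 1

    if ! tar xzf "$tarball" -C "$scratch"; then
        echo "extraction failed: $tarball" >&2
        rm -rf "$scratch"
        return 1
    fi

    mkdir -p "$webroot"
    # Copy the directory's contents (dotfiles included), not the directory.
    cp -a "$scratch"/. "$webroot"/
    rm -rf "$scratch"
}
```

A boot script could then call deploy_tarball /tmp/your-code.tgz /var/www/vhosts and restart Apache only if the call returns successfully.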

Once your code is grabbed and unpacked, you will probably be annoyed; it won't work! Most apps need a little extra configuration and tweaking to work straight out of a tar. A basic check is to see if your app relies on absolute configuration (e.g., paths, the machine's IP address, etc.). Make sure it's happy to run from any location, so if the instance changes, the app still works.
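One rough way to run that basic check is to search the unpacked code for anything shaped like an IP address. This is only a sketch: the helper name find_hardcoded_ips is made up for illustration, and the pattern is deliberately loose, so it will also flag harmless strings such as four-part version numbers.

```shell
#!/bin/bash
# Sketch: list lines in an unpacked codebase that contain something shaped
# like an IPv4 address. Treat every hit as a hint to review, not as proof
# of a problem. find_hardcoded_ips is a hypothetical helper name.
find_hardcoded_ips() {
    local dir="$1"         # e.g., /var/www/vhosts
    # -r: recurse, -n: show line numbers, -E: extended regular expressions
    grep -rnE '([0-9]{1,3}\.){3}[0-9]{1,3}' "$dir"
}
```

Each line of output has the form file:line:content, and the function returns a non-zero status when nothing matches. A similar grep for absolute paths would follow the same shape.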

Databases and DNS

Dynamic DNS is a powerful tool for resilient, self-mending clouds. Because a new instance has a new IP address, what happens if the master database crashes? For example, how do all the slaves and all the clients know the IP address of the new master?

The solution is to give your database server a fully qualified domain name (e.g., database.cloud.example.com), but set the DNS A-record to the database's internal IP address. Then, all your slaves and clients can connect to database.cloud.example.com.

Dynamic DNS comes in when the IP address changes. If you kill an instance and start up a new one, you want the A-record to change. You can do this in one line:

curl 'http://www.dnsmadeeasy.com/servlet/updateip?username=myuser&password=mypassword&id=99999999&ip=123.231.123.231'

To do this, use a service such as DNS Made Easy [7], which I've used in this example. (If you've ever changed DNS via an ISP like Domain Monster or 1&1, this is the equivalent of logging in and updating the A-record, but it can be done programmatically.)

The startup script for a database server is much like the one for the web server, with an extra line at the bottom:

apt-get -y install mysql-server
#... grab your SQL from somewhere and install it.
service mysql restart
# Now update the IP address for 'database.cloud.example.com'
curl 'http://www.dnsmadeeasy.com/servlet/updateip?username=myuser&password=mypassword&id=99999999&ip=123.231.123.231'

The preceding script will install MySQL, grab your SQL code, and then update the IP address so all your web servers can connect to the new instance.
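In practice, the address in the update call has to be the new instance's own IP rather than a fixed value. The sketch below builds the URL from variables and, on EC2, asks the instance metadata service for the internal address. The URL format is the one shown in this article; the helper names ddns_update_url and update_dns are hypothetical, and the sketch assumes the metadata endpoint answers without a session token and that the credentials contain no characters that need URL encoding.

```shell
#!/bin/bash
# Sketch: build the dynamic DNS update URL from variables instead of a
# hard-coded address. ddns_update_url and update_dns are hypothetical
# helper names; values are inserted as-is, with no URL encoding.
ddns_update_url() {
    local user="$1" pass="$2" record_id="$3" ip="$4"
    printf 'http://www.dnsmadeeasy.com/servlet/updateip?username=%s&password=%s&id=%s&ip=%s\n' \
        "$user" "$pass" "$record_id" "$ip"
}

update_dns() {
    local user="$1" pass="$2" record_id="$3"
    local ip
    # On EC2, the instance metadata service reports the internal address.
    # Assumption: the endpoint is reachable without a session token.
    ip="$(curl -sf http://169.254.169.254/latest/meta-data/local-ipv4)" || return 1
    curl -sf "$(ddns_update_url "$user" "$pass" "$record_id" "$ip")"
}
```

The last line of the database startup script could then become update_dns myuser mypassword 99999999.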

The reassuring thing about this solution is that it means even the failure of a master database server doesn't really matter: If it were to crash, the server farm would only be down for a few minutes.

Proxy, NFS, and More

If your app allows users to upload files, you'll also need to set up an NFS server. As with the database server, you'll need to use Dynamic DNS to update the IP address of your NFS server, but you will also have to run a script across all the client instances to re-mount the NFS server.
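The re-mount script for the client instances could look something like the following sketch. The names remount_nfs, run, and DRY_RUN are hypothetical, as are the host name and paths in the comments; with DRY_RUN=1, the commands are printed rather than executed, which makes the script easy to try out before running it across a farm.

```shell
#!/bin/bash
# Sketch: re-mount an NFS export on a client after the server has been
# replaced. remount_nfs, run, and DRY_RUN are hypothetical names.
run() {
    # Print the command in dry-run mode; otherwise execute it.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

remount_nfs() {
    local server="$1"       # e.g., nfs.cloud.example.com (hypothetical)
    local export_path="$2"  # e.g., /export/uploads
    local mount_point="$3"  # e.g., /mnt/uploads

    # Lazy unmount (Linux umount -l) so a dead server does not block the
    # script; ignore the error if nothing was mounted.
    run umount -l "$mount_point" || true
    run mkdir -p "$mount_point"
    run mount -t nfs "$server:$export_path" "$mount_point"
}
```

Because the clients mount by name rather than by IP address, the same script works unchanged once the DNS record points at the new NFS server.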

Other common roles are a proxy - especially a caching proxy running something like Squid - and a load balancer. Amazon has its own load balancer built into AWS, but with Rackspace, you'd need to create a role that runs a tool like the HAProxy load balancer.
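For a hand-built load balancer role, a script might generate the HAProxy back-end section from the list of application server addresses. The following is only a sketch: the helper name haproxy_backend, the round-robin policy, and port 80 are assumptions chosen for illustration, and the output is a fragment, not a complete HAProxy configuration.

```shell
#!/bin/bash
# Sketch: print an HAProxy "backend" section for a list of app servers.
# haproxy_backend is a hypothetical helper; port 80 and round-robin
# balancing are illustrative choices.
haproxy_backend() {
    local name="$1"; shift
    local i=1 ip
    echo "backend $name"
    echo "    balance roundrobin"
    for ip in "$@"; do
        # "check" turns on health checks for this server.
        echo "    server ${name}${i} ${ip}:80 check"
        i=$((i + 1))
    done
}
```

Calling haproxy_backend app 10.0.0.1 10.0.0.2 prints a section with two server lines, which a script could splice into the rest of the configuration before reloading HAProxy.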

Get Bigger

Scalr lets you define when the cloud scales and how far it will grow (Figure 3). In addition to setting load averages, which means your cloud will grow as the load on your servers increases, you can schedule scaling on the basis of time and day of the week or calls to an API.

Figure 3: Defining scaling options.
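To make the idea of load-based scaling concrete, the sketch below shows one simple rule: grow when the load average is above an upper threshold, shrink when it is below a lower one, and never leave the configured instance limits. This is not Scalr's actual algorithm; the function name scale_decision and its parameters are invented for illustration.

```shell
#!/bin/bash
# Sketch of a load-based scaling rule; NOT Scalr's actual algorithm.
# Prints "up", "down", or "hold". All names here are hypothetical.
#
# Arguments: load average, lower threshold, upper threshold,
#            current instance count, minimum instances, maximum instances.
# On Linux, the 1-minute load average could be read with:
#   read -r load1 _ < /proc/loadavg
scale_decision() {
    local lavg="$1" low="$2" high="$3" current="$4" min="$5" max="$6"
    # awk handles the floating-point comparison that bash arithmetic cannot.
    awk -v lavg="$lavg" -v low="$low" -v high="$high" \
        -v cur="$current" -v min="$min" -v max="$max" 'BEGIN {
            if (lavg > high && cur < max)      print "up"
            else if (lavg < low && cur > min)  print "down"
            else                               print "hold"
        }'
}
```

With thresholds of 1.0 and 4.0 and limits of one to five instances, scale_decision 5.2 1.0 4.0 2 1 5 prints "up", whereas the same load on a farm that already has five instances prints "hold".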

With a scalable farm configuration, the next step is to start playing with the scripts. In Scalr, as in RightScale, you can use the same scripts across multiple farms, which means you can try out new things in a test farm before running them across your live site. For example, you might hone the configuration of Apache or apply memcached to your app. Now that you know your farm configuration is reliable, you can play around with the scripts and create new versions. When you're ready to roll them out, you can either start an entirely separate farm or simply terminate the farm and start it again.

The reassuring thing about managing your cloud hosting programmatically is that you escape the trap that many people fall into of assuming and hoping that the servers are running. With programmatic management, you can experiment with different configurations to find the ideal design.

Scalr currently works with Amazon cloud services, but its developers are reportedly working on a version that supports Rackspace. RightScale supports Amazon and Rackspace.

Pointers for the Cloud

Keep the following tips in mind when you start to build your scalable cloud:

Store and deliver uploaded files on Amazon's S3 service, not on the local filesystem. For example, you could use the http://wordpress.org/extend/plugins/tantan-s3/ plugin to store your WordPress site on Amazon.

Minimize filesystem writes. Either use a database server for persistent data or use another persistent storage device, such as S3 or SimpleDB.

Manage your development environment as you would your production environment; otherwise, you will find yourself tracking down small and annoying differences.

Think through potential problems up front, write scripts to deal with the problems, and test the scripts in advance rather than having to start from scratch in a crisis setting.

Test your infrastructure regularly. With cloud hosting, you can fire up and shut down your entire farm quickly, so make sure you do this regularly. It gives you a lovely warm feeling!

INFO

[1] Scalr: https://www.scalr.net/
[2] RightScale: http://www.rightscale.com/
[3] Amazon Web Services: http://aws.amazon.com/ec2/
[4] Rackspace Cloud: http://www.rackspacecloud.com/
[5] Download Scalr code: http://code.google.com/p/scalr/
[6] Scalr development login: http://development.scalr.net/
[7] DNS Made Easy: http://www.dnsmadeeasy.com/