What's better than three-letter acronyms? Three-letter command-line tools.
There's just something about the fresh start you get with a new job. Both my previous job and my new one began with the opportunity to build a new infrastructure from scratch. In both cases, as is common with startup infrastructure in its early stages, everything was to be built using Amazon Web Services (AWS), specifically its Elastic Compute Cloud (EC2) infrastructure. Two significant things changed in the years between the two jobs that led me to take a fresh approach to the infrastructure, and those are the things I'm going to explore in this article: Amazon's Virtual Private Cloud (VPC) and the AWS command-line tools.
VPC has been around for some time, and it was around back when I started work on my previous infrastructure, but at that point it was an extra feature. At the time, you defaulted to what Amazon now calls “EC2 Classic”, which meant you essentially were sharing an internal IP space with everyone else on EC2. With VPC, you can define different subnets, including ones that have only non-Internet-routable RFC1918 IP addresses. You get to choose which hosts get external IPs and which don't, and you can define not only what subnet a host is in, but even assign it a specific internal IP if you want. Additionally, you have more control over Security Groups (which allow you to assign firewall rules to groups of hosts): you can filter not only incoming (ingress) traffic but also outgoing (egress) traffic. You even can add or remove security groups on a host after it has spawned—something just not possible in EC2 Classic.
Today, new Amazon accounts are VPC-only. It's clear that Amazon is phasing out EC2 Classic, not just by calling it “Classic” and making VPC the default, but also by creating new lower-cost instance types that are available only on VPC. Even though by default the subnets that come with a VPC behave much like EC2 Classic, with public and private IPs, there are still enough differences, and enough potential once you consider private subnets, that you're forced to take a fresh approach to how you build things.
It may not come as any surprise at all to my regular readers that when I initially got started with Amazon infrastructure, I avoided the Web interface whenever possible and stuck with command-line tools. Although back then there were a number of different libraries (such as the boto library for Python) that let you interact with Amazon's API programmatically, the main command-line interface for Amazon EC2 was known as the EC2 API Tools (the ec2-api-tools package in Ubuntu). It was Java-based and provided a way to perform all of the common functions you might otherwise perform on the Web interface via commands named ec2-<some function>. It wasn't so bad once you got used to it. The primary complaint I had with the EC2 API Tools was how slow they were between issuing a command and getting a response back.
A year or so ago I discovered that Amazon had developed a new Python-based command-line tool called AWS CLI and made a mental note to try it when I had a chance. Since I already had scripts in place that took advantage of the EC2 API tools, and they worked, I didn't have any compelling reason at the time to switch. Now that I was faced with building new scripts for a new infrastructure, however, it seemed as good a time as any to try it out.
I'm going to skip the details of how to install and configure AWS CLI, since that's already well documented on the main page for AWS CLI, and it's probably already packaged for your Linux distribution (and if not, you can run pip install awscli to get it). Although you can use AWS CLI with a number of different AWS APIs, I'm going to stick to how it compares to the EC2 API Tools. Instead of every command starting with ec2-, every command starts with aws ec2 followed by the function name. Fortunately, most of the commands match in both environments, so where before you might have typed:
$ ec2-describe-instances
Now you type:
$ aws ec2 describe-instances
Unfortunately, the similarities pretty much end there. Where with ec2-describe-instances you could just append a list of IDs, with AWS CLI, you need to use the --instance-ids argument first:
$ aws ec2 describe-instances --instance-ids i-someid
Another difference between the two tools is the output. You can select between a table, text and JSON output with AWS CLI; however, the text output ends up being quite different from the format in EC2 API tools. This means if, like me, you have tools that parse through that output, you'll need to rework them.
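The format is chosen per command with the --output flag (and you can set a default output format in your ~/.aws/config). A quick sketch of the same query in each form—these calls assume working AWS credentials, so the sketch reports failures rather than aborting:

```shell
# The same query in each of AWS CLI's three output formats.
for fmt in json table text; do
    aws ec2 describe-instances --output "$fmt" 2>/dev/null ||
        echo "describe-instances in $fmt format failed (credentials?)"
done
```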
For instance, I wrote a wrapper script to spawn new EC2 hosts that captures all of the common options I might pass along the command line based on a host's name. One check I perform before I spawn a host, though, is to test whether I already have an instance tagged with that host's name. So something like this:
ec2-describe-tags --region $REGION | cut -f5 | \
    egrep -q "^$MYHOST$"
becomes something like this:
aws ec2 describe-tags --region $REGION | grep Name | \
    grep instance | cut -f5 | egrep -q "^$MYHOST$"
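In my wrapper script, that check ends up inside a conditional before the spawn happens. Here's a rough sketch of how it fits together—the function name, hostname and region are placeholders of my own, not values from my actual script:

```shell
# The parsing half of the check lives in its own function that reads
# describe-tags text output on stdin; field 5 is the tag's value.
host_tag_exists() {
    grep Name | grep instance | cut -f5 | egrep -q "^$1$"
}

MYHOST=myhostname
REGION=us-east-1
if aws ec2 describe-tags --region "$REGION" --output text 2>/dev/null |
        host_tag_exists "$MYHOST"; then
    echo "An instance named $MYHOST already exists; not spawning another."
fi
```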
I had to make another change to my script when it came to assigning security groups. With EC2 Classic, you could refer to security groups by name when spawning a host by passing multiple -g options to ec2-run-instances. Once you use VPCs though, you must refer to a security group by its ID when spawning a host. What I do is assign all of the security groups that belong to a host to a variable like so:
SEC_GROUPS='default foo'
Then I iterate through that list to pull out the IDs:
for g in ${SEC_GROUPS}; do
    SGID=$(aws ec2 describe-security-groups --filters \
        "Name=group-name,Values=$g" | grep ^SECURITYGROUPS | cut -f3)
    SGIDS="$SGIDS $SGID"
done
Then I can pass --security-group-ids $SGIDS to my aws ec2 run-instances command.
Another difference I ran across was in handling VPC subnets. With EC2 Classic, you didn't have to worry about subnets at all, and although you can label subnets with names in a VPC, when you spawn a host, it wants you to specify the ID. Again, I put the subnet label into a variable and use the AWS CLI to pull out the ID:
SUBNETID=$(aws ec2 describe-subnets --filters \
    "Name=tag:Name,Values=${SUBNET}" | grep ^SUBNETS | cut -f8)
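With the subnet ID and security-group IDs in hand, the spawn itself is a single command. A rough sketch of how the pieces come together—the AMI ID, instance type, key name and IDs below are placeholders, not values from my actual script:

```shell
# Placeholder values standing in for the lookups described above.
SUBNETID=subnet-0123abcd
SGIDS="sg-0123abcd sg-4567cdef"

# Build the argument list so it can be inspected before launching;
# $SGIDS is deliberately unquoted so each ID becomes its own argument.
ARGS=(--image-id ami-0123abcd
      --instance-type t2.micro
      --key-name mykey
      --subnet-id "$SUBNETID"
      --security-group-ids $SGIDS)

aws ec2 run-instances "${ARGS[@]}" 2>/dev/null ||
    echo "run-instances failed (placeholder IDs, no credentials?)"
```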
Back with EC2 API tools, it was easy to get the information about a specific host:
$ ec2-describe-instances | grep myhostname
Or the shorthand form:
$ ec2din | grep myhostname
Unfortunately, things aren't quite as easy with AWS CLI due to how the text output is formatted. Instead, you should use the --filters option to limit your search only to instances that have a Name tag that matches your hostname:
$ aws ec2 describe-instances --filters \
    "Name=tag:Name,Values=myhostname"
I ended up turning the previous command into a quick script called ec2dingrep:
#!/bin/bash

HOSTNAME=$1
aws ec2 describe-instances --filters \
    "Name=tag:Name,Values=${HOSTNAME}"
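I'll also note that AWS CLI has a --query option that applies a JMESPath expression to the JSON output on the client side, which might replace some of this grep and cut work, though I haven't converted my scripts over to it. A hypothetical example that pulls out just the instance ID for a given hostname:

```shell
# --query filters the JSON client-side; --output text flattens the
# result to plain text. IDS ends up empty if the call fails (for
# example, when no credentials are configured).
IDS=$(aws ec2 describe-instances \
    --filters "Name=tag:Name,Values=myhostname" \
    --query 'Reservations[].Instances[].InstanceId' \
    --output text 2>/dev/null) || IDS=""
echo "$IDS"
```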
All in all, it really took me only part of a day to climb the learning curve of interacting with VPCs with AWS CLI and rewrite the script I use to spawn servers. Although so far it does take me a bit longer to string together the commands, much of that is muscle memory, and the much better speed of AWS CLI over EC2 API tools makes it worth it.