
Letters

Server Hardening—ipset:set

Regarding Greg Bledsoe's “Server Hardening” article in the November 2015 issue: I created a modified script for generating ipset blocklists. Namely, it creates a pair of ipsets, one a hash:net and the other a hash:ip. The script generates a second script called blset.sh, which adds the IP addresses to the ipset hashes. The blset.sh script first adds all the hash:net entries from the various sources; the hash:ip set is then populated, skipping any entry already covered by the hash:net set.
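
In outline, blset.sh does something like the following (the set names and addresses here are simplified examples, not the real lists):

ipset create blocklist_net hash:net
ipset create blocklist_ip hash:ip

# hash:net entries from the various sources go in first
ipset add blocklist_net 203.0.113.0/24

# a hash:ip entry is skipped when a net entry already covers it
ipset -q test blocklist_net 198.51.100.7 || ipset add blocklist_ip 198.51.100.7

The sets are then matched from iptables with rules along the lines of:

iptables -I INPUT -m set --match-set blocklist_net src -j DROP
iptables -I INPUT -m set --match-set blocklist_ip src -j DROP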

The new script stays within the ipset size limit. The script suggested in Greg's Linux Journal article puts all the IPs into a single hash:ip set, which quickly becomes too large. The script is at https://github.com/zuikway/tlj_blocklist.


Wayne Shumaker

Server Hardening, II

Greg Bledsoe missed one small thing that can increase a server's security by reducing the amount of network traffic the server must process:

iptables -t mangle -I PREROUTING -m state --state INVALID -j DROP

INVALID packets are those that should belong to an established connection, yet netfilter has no connection recorded for them. They are “spurious” packets that cannot be delivered, so they should be dropped as early as possible; it isn't worth spending one extra CPU cycle on them. Although this won't eliminate the ill effects of a DDoS attack, it can significantly reduce the time the CPU spends handling INVALID packets.
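
For what it's worth, the same rule can also be written with the newer conntrack match, which should behave identically here (the state match is effectively an alias for it on current kernels):

iptables -t mangle -I PREROUTING -m conntrack --ctstate INVALID -j DROP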


Neal

Find Words

Dave Taylor's Work the Shell column in the September–November 2015 issues covers a fun toy program near and dear to my heart. I've been using a very similar de-jumbling algorithm to strengthen my scripting in Perl and Python, although I must admit I haven't been ambitious enough to implement it in bash! It was cool to see Dave use nearly the same approach I came up with myself, so I figured it might be interesting to share my own variation on the problem.

Considering that modern machines are overkill for most scripts, I started off by simply alphabetizing the letters of every word in the entire dictionary (first /usr/share/dict/words, and later a set of professional Scrabble word lists I found online). In my language du jour, I construct a massive hash keyed on those alphabetized words, with an array of the matching original words as the value. For example:

$list{'abt'} -> ['bat', 'tab']

All in all, this approach takes only a few seconds on a five-year-old laptop, and 21MB of RAM for the data structure.
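
Here's a minimal sketch of that preprocessing in Python (the file path and variable names are just for illustration):

from collections import defaultdict

# map each word's alphabetized letters to the original words
words_by_key = defaultdict(list)
with open('/usr/share/dict/words') as f:
    for line in f:
        word = line.strip().lower()
        words_by_key[''.join(sorted(word))].append(word)

print(words_by_key['abt'])   # ['bat', 'tab']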

The next fun part was digging into my computer science background and using a recursive algorithm to deconstruct the input letter sets by calling the same function minus a different letter each time and looking up the result in the hash. Putting the input function into a loop (checking for EOF or “q” for termination) allows you to perform multiple searches against the hash you've spent several busy CPU-seconds constructing.
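
A sketch of the recursion and the input loop, assuming the words_by_key hash from the snippet above (the three-letter cutoff is my own arbitrary choice):

def find_words(letters, table, seen=None):
    # look up this letter set, then recurse with one letter removed each time
    if seen is None:
        seen = set()
    key = ''.join(sorted(letters))
    if key in seen:                    # already tried this letter set
        return set()
    seen.add(key)
    found = set(table.get(key, []))
    if len(key) > 3:                   # don't recurse below three letters
        for i in range(len(key)):
            found |= find_words(key[:i] + key[i + 1:], table, seen)
    return found

while True:
    try:
        line = input('letters> ').strip()
    except EOFError:                   # quit on EOF or "q"
        break
    if line == 'q':
        break
    print(sorted(find_words(line, words_by_key)))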

Keep on hacking!


Chandler Wilkerson

Dave Taylor replies: Great to hear from you, Chandler, and glad my column brought you some enjoyment as you realized we'd taken the same algorithmic approach to the word jumble problem!

AWS EC2 VPC CLI

Thanks for an excellent journal. I really enjoy it and love the digital version on my Kindle.

The reason I'm writing is a general hint related to Kyle Rankin's great article on the EC2 CLI in the October 2015 issue. I went through an identical process, for exactly the same reasons, when changing to the Python CLI. The only thing I chose to do differently in the end was how I process the output. I occasionally had issues processing the text output of the Java CLI, in that it sometimes changed slightly between versions, forcing me either to stick to a certain version or to adapt my awk|perl|grep processing of the text. Text output from the Python CLI was bigger and a bit trickier to parse well. Enter JSON output: as Kyle writes, the Python CLI offers several output formats, including JSON. It's a slightly steeper learning curve, but using the JSON output together with the jq JSON command-line parser makes processing anything from the CLI straightforward, and it keeps me safe from the CLI adding fields or newlines that might break my text processing! One can always script things more elegantly, but being a one-liner fan, one can, for example, get all the volume IDs for one's servers:

aws ec2 describe-instances | jq -r \
 '.Reservations[].Instances[].BlockDeviceMappings[].Ebs.VolumeId'

Taking it a little further, you can snapshot every EBS volume, but only if its instance does not carry a certain tag (or do it the other way around and snapshot only a given tag), and only those volumes attached at a given device name:

aws ec2 describe-instances | jq -r \
 '.Reservations[].Instances[]
  | select(contains({Tags: [{Key: "SomeKey", Value: "SomeValue"}]}) | not)
  | .BlockDeviceMappings[]
  | select(contains({DeviceName: "/dev/sda"}))
  | .Ebs.VolumeId' |
 parallel aws ec2 create-snapshot \
  --description "backup_`date +\%Y\%m\%d`" --volume-id

parallel is a great trick for calling the command once per volume ID. I would often use xargs and give multiple IDs in one call, but with the Python CLI, each call can take only one volume ID. I add the date to the description for a better overview of snapshots and a simple way to monitor and delete given snapshots.
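
If GNU parallel isn't at hand, a rough equivalent of that last step is xargs with -I, which likewise makes one call per volume ID (again just a sketch along the lines of the one-liner above):

aws ec2 describe-instances | jq -r \
 '.Reservations[].Instances[].BlockDeviceMappings[].Ebs.VolumeId' |
 xargs -I{} aws ec2 create-snapshot \
  --description "backup_`date +\%Y\%m\%d`" --volume-id {}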

Then, I would also have a similar simple one-liner to clean up old snapshots and monitor that all snapshots are successful.
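
Such a cleanup might look something like this sketch (the description prefix match assumes the backup_YYYYMMDD naming above, and a real script would compute the cutoff rather than hard-coding one):

aws ec2 describe-snapshots --owner-ids self | jq -r \
 '.Snapshots[]
  | select((.Description // "") | startswith("backup_2015"))
  | .SnapshotId' |
 parallel aws ec2 delete-snapshot --snapshot-id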

Keep up the good work!


Elfar

Photo of the Month

Mateo from Argentina, already supporting Linux on the first day of his life.


Gaston
