Dave explains how to add contextual memory to a script so it can keep track of trends in its environment. Then he explores why that's harder if the script stops and starts over time.
It's been months, and I'm still dealing with a DDOS (distributed denial of service) attack on my server—an attack that I can see is coming from China, but there's not really much I can do about it other than try to tweak firewall settings and so on.
In my last article, I started describing analyzing Apache log files and system settings to try to create some scripts that can watch for DDOS attacks and flag them before that inevitable e-mail from a customer about the site being down or incredibly slow.
Honestly, it's pretty disheartening dealing with these anonymous attacks, but that's another column. If you've had experience with this sort of long-term problem, I'd be interested in hearing about how you finally resolved it.
The first script I worked on simply tracks how many processes are running on the server. It's just a small step from monitoring the output of the uptime program, which, of course, could be done as easily as:
load="$(uptime | cut -d' ' -f14 | cut -d. -f1)"
if [ "$load" -gt 4 ] ; then
  echo "Uptime greater than threshold. What's going on?"
  uptime
fi
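One caveat with counting fields: the number of space-separated fields in uptime's output changes with how long the machine has been up and how many users are logged in, so field 14 won't always be the load. A minimal sketch of a sturdier approach, assuming the output contains the usual "load average:" label, is to anchor on that label instead. The function name one_minute_load and the sample line are made up for illustration:

```shell
# Print the integer part of the 1-minute load average found in
# uptime-style text on stdin (accepts "load average:" and "load averages:").
one_minute_load() {
    sed -n 's/.*load averages\{0,1\}: *\([0-9][0-9]*\).*/\1/p'
}

# Sample line standing in for real uptime output, so the result is predictable.
sample=' 10:15:01 up 12 days,  3:04,  2 users,  load average: 5.42, 0.36, 0.30'
load="$(printf '%s\n' "$sample" | one_minute_load)"

if [ "${load:-0}" -gt 4 ] ; then
    echo "Load $load is above threshold. What's going on?"
fi
```

On a live system, you'd swap the printf for uptime itself: load="$(uptime | one_minute_load)".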
That's marginally interesting, but let's go back to analyzing the number of processes running, as that's a better indication that there might be a DDOS happening. Last time, I used a loop of ps ; sleep and kept track of the min/max values. After running for a while, the script's output was something like this:
Current ps count=90: min=76, max=150, tally=70 and average=107
Interpret this as follows: after running for 90 cycles, the minimum number of Apache (httpd) processes running at any given time was 76, the maximum was 150 and the average was 107.
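For readers who missed last month's column, the bookkeeping behind that kind of summary is simple. What follows is not the original script, just a rough sketch of the idea: it reads one process count per line from stdin (rather than calling ps and sleep itself) and reports the minimum, maximum and integer average. The function name and output format are my own assumptions:

```shell
# Read one process count per line on stdin; print summary statistics.
summarize_counts() {
    min="" max=0 tally=0 sum=0
    while read -r count ; do
        tally=$(( tally + 1 ))
        sum=$(( sum + count ))
        if [ -z "$min" ] || [ "$count" -lt "$min" ] ; then min=$count ; fi
        if [ "$count" -gt "$max" ] ; then max=$count ; fi
    done
    if [ "$tally" -gt 0 ] ; then
        echo "samples=$tally min=$min max=$max average=$(( sum / tally ))"
    fi
}

# Three made-up samples, purely for demonstration.
printf '%s\n' 76 150 95 | summarize_counts
```

In a real monitor, the samples would come from a loop that runs ps, counts the matching lines and sleeps between readings.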
Once you've run this script for a few hours, you'll have a pretty good idea of typical traffic load on your server, which is critical for you to be able to detect out-of-pattern variations. If my average Apache process count is 107 and the server has 917 processes, there's a problem. On the other hand, if the average load is 325 processes, 600 isn't too far out of band and could be representative of a spike in traffic rather than the beginnings of an actual attack.
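One way to turn that reasoning into a test is to compare the current count against a multiple of the observed average. The sketch below assumes a multiplier of 3, which is an arbitrary value chosen for illustration and not something measured on a real server; out_of_band is likewise a made-up name:

```shell
# Succeed (exit status 0) when count exceeds factor times the baseline average.
out_of_band() {
    count=$1 average=$2 factor=$3
    [ "$count" -gt $(( average * factor )) ]
}

# The two scenarios described above, with an assumed factor of 3.
if out_of_band 917 107 3 ; then
    echo "917 against an average of 107: out of band"
fi
if ! out_of_band 600 325 3 ; then
    echo "600 against an average of 325: within band"
fi
```

Tune the factor to your own traffic; a site with bursty but legitimate load may need a larger one.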
I wrapped up last month's article by showing a script suitable for running from cron that would look for abnormal traffic spikes. But it had some problems, as I highlighted at the end of the column, my nerd version of a cliffhanger—no murder unsolved, no car in the driveway, no police sirens, just some code that needs improvement. Hey, work with me here!
The core problem with the script was really a lack of context. That is, I don't want to be notified the very first time it sees an abnormally high process count, but the third or fourth time in a row. And once it has notified me, I want the script to be smart enough not to notify me again unless things settle down and then jump up again.
With some scripts, this sort of thing can be quite tricky, requiring temporary files and semaphores to ensure that the script doesn't step on another version of itself reading/writing the file simultaneously. It's doable, but you really have to think about worst-case situations, temporarily blocked I/O channels and so on.
Fortunately, that's not the situation here. In fact, you can accomplish the addition of context by adding a couple of state variables. Let's start by pulling the monitoring script back in:
#!/bin/sh
# DDOS - keep an eye on process count to
# detect a blossoming DDOS attack

pattern="httpd"
max=200        # avoid false positives
admin="email@example.com"

count="$(ps -C $pattern | wc -l)"

if [ $count -gt $max ] ; then
  echo "Warning: DDOS in process? Current httpd count = $count" | sendmail $admin
fi

exit 0
Let's use a “repeated” variable and set it to send a notification after the fourth occurrence of too many processes. That's just a matter of changing the conditional statement:
if [ $count -gt $max ] ; then
  repeated=$(( $repeated + 1 ))
  if [ $repeated -eq 4 ] ; then
    echo "Warning: DDOS in process? Current httpd count = $count" | sendmail $admin
  fi
fi
Not too difficult. But, what happens when there's then an iteration where the count isn't greater than the max threshold? That's also easily handled if you're willing to keep redundantly setting repeated to zero. The outer “fi” simply changes to:
else
  repeated=0
fi
These additions produce both of the desired updates actually, because repeated ensures that it won't notify of a problem until it's happened a few times, and the conditional $repeated -eq 4 rather than, say, $repeated -gt 4, also means that if it's slammed for 15 iterations, you'll still see only one e-mail message.
Finally, visualize a sequence where you get a spike for a while, then it quiets down. Then another spike hits for a few seconds, then it quiets down for an hour or two, and then it spikes again. That scenario will work as desired too, sending e-mail twice, once for each sustained spike, but not for the one-iteration spike in the middle (because in that instance, repeated never gets beyond 1).
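To watch that scenario play out without waiting for a real attack, here is a small simulation. It is a sketch under a few assumptions: the conditional is wrapped in a function I've called check_count, the warning is printed rather than mailed, and the sample counts are invented. The counter lives in a shell variable, so this only works while a single long-running process sees every sample:

```shell
max=200       # threshold from the monitoring script
repeated=0    # consecutive over-threshold samples seen so far

# Process one sample; print a warning on the fourth consecutive high reading.
check_count() {
    count=$1
    if [ "$count" -gt "$max" ] ; then
        repeated=$(( repeated + 1 ))
        if [ "$repeated" -eq 4 ] ; then
            echo "Warning: DDOS in process? Current httpd count = $count"
        fi
    else
        repeated=0
    fi
}

# Sustained spike, quiet, one-sample spike, quiet, sustained spike.
for sample in 250 260 270 280 290 100 300 100 100 210 220 230 240 ; do
    check_count "$sample"
done
```

The loop prints two warnings, one for each sustained spike (at counts 280 and 240), and stays silent for the lone reading of 300.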
If you've been looking closely at the script, you'll have noticed that what appears at first glance to be an innocuous “echo” statement is in fact producing an e-mail message to the administrator. It's the second half of the statement that's doing the heavy lifting:
echo "Warning: DDOS in process? Current httpd count = $count" | sendmail $admin
I'm old-school, so I go straight to the MTA (mail transport agent), but in reality, a better way to do this, an approach that would have the e-mail include a subject line, is to do something more like this:
echo "Warning: DDOS in process? Current httpd count = $count" | mail -s "DDOS Warning" $admin
Because sendmail is designed to be used behind the scenes, it lacks the refinement of things like a -s flag to specify the subject of the message.
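In fact, you can make this even a bit more elegant by wrapping the entire command in a shell function. (A plain variable is tempting, but the quotes around the two-word subject don't survive when the shell expands the variable and splits it into words.)
sendmsg() { mail -s "DDOS Warning" "$admin" ; }
Then you easily can tack it onto the end of the echo statement:
echo "Warning: DDOS in process? Current httpd count = $count" | sendmsg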
What's the biggest problem with this approach? The pipe carries only stdout, not stderr, so if there are any errors, those messages never make it into the e-mail and are lost.
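One possible way to keep those errors around is to redirect stderr of the mail command to a log file and check its exit status. The sketch below assumes a helper named notify and a log location in $errlog, both invented for illustration. It takes the message as its first argument and the command to pipe it into as the remaining arguments:

```shell
# Hypothetical log location; adjust to taste.
errlog="${TMPDIR:-/tmp}/ddos-notify-errors.log"

# Pipe message $1 into the command given by the remaining arguments.
# Anything that command writes to stderr is appended to $errlog, and a
# failure is recorded there too.
notify() {
    message=$1
    shift
    if ! printf '%s\n' "$message" | "$@" 2>> "$errlog" ; then
        echo "$(date): notification command failed: $*" >> "$errlog"
        return 1
    fi
}

# Harmless demonstration: cat stands in for the mail command.
notify "Warning: DDOS in process? (test message)" cat

# Real use would look something like:
#   notify "Warning: DDOS in process? Current httpd count = $count" \
#       mail -s "DDOS Warning" "$admin"
```

Passing the command as separate arguments also keeps the two-word subject line intact, because "$@" preserves each argument exactly as given.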
If you were going to run the script every few minutes from cron, you could use its capability of e-mailing stdout + stderr, but that would involve a more complex contextual tracking solution (as discussed earlier).
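To give a flavor of what that more complex solution involves, here is a minimal sketch that keeps the counter in a small state file between runs. The function name, the file path in the usage comment and the validation step are all assumptions of mine, and the sketch does no locking, so it presumes that two runs never overlap:

```shell
# Update the counter stored in state file $1 for sample $2 against
# threshold $3, and print the new counter value.
update_repeated() {
    statefile=$1 count=$2 max=$3
    repeated=0
    if [ -r "$statefile" ] ; then
        read -r repeated < "$statefile"
    fi
    case "$repeated" in
        ''|*[!0-9]*) repeated=0 ;;   # treat a missing or corrupt value as zero
    esac
    if [ "$count" -gt "$max" ] ; then
        repeated=$(( repeated + 1 ))
    else
        repeated=0
    fi
    echo "$repeated" > "$statefile"
    echo "$repeated"
}

# Example wiring for a cron-driven run (the path is hypothetical):
#   repeated="$(update_repeated /var/tmp/ddos.repeated "$count" "$max")"
#   if [ "$repeated" -eq 4 ] ; then
#       echo "Warning: DDOS in process?" | mail -s "DDOS Warning" "$admin"
#   fi
```

With a cron interval of a few minutes, overlapping runs are unlikely for a script this short, but a busier setup would want a lock around the read and write.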
Well, I'm off to install this script on my server and see how it does for a while. And you? Your assignment, dear reader, is to shoot in some proposed topics for me to explore in my column. What scripts would you like to see created, doing what?