Log system activities over a TCP network securely by interfacing the existing syslog dæmon with secure shell using simple Perl scripts.
Undoubtedly, one of the most important tasks for keeping a networked system safe is monitoring its activity. Most system programs today communicate with syslogd noting important events (such as an su request). Also, the kernel can note hardware failures and other things on that level. Of course, when monitoring for possible break-ins or trying to track down the path an intruder used to violate a system, the logs can be a precious resource. However, it is evident that an intruder could manipulate system logs if he were able to get the necessary permissions, thus completely fooling any attempt to track down the intrusion, even to the extreme point of making it hard to spot.
A first step toward solving the problem of keeping the system logs on the system itself was made many years ago when the ability to send logging messages over the network to another host was added to syslogd. Sending the needed log parts to another host, which is quite secure (a dedicated machine, for example), was actually a half-solution to the problem. Using a trivial protocol like UDP to transmit data over networks made it very easy to spoof, making it possible for the intruder to add malicious messages or even try to create a denial-of-service attack. Again, all data is sent through the network in clear form, making it simple to sniff, so an intruder could get more details about the system before attempting a break-in.
A solution we developed to solve this problem is having the standard syslogd facility use the secure shell (ssh) package to forward the logs, using authentication and encryption, to another computer. This way, the system we want to keep logged won't require special changes, and the system that will maintain the logs will not require another running syslogd. It will require sshd (secure shell dæmon) running with some normal system tools on a normal user account.
We will present two technically similar but conceptually different methods, with some pros and cons, of performing this important task. The first could be compared to a “push” technology; that is, the machine that wants to log its data on an outside machine will actually force data to go on the other side. The second method will use a “pull” technology, since the remote client will try to retrieve the logs from the desired machine.
Since our solutions use the standard syslog facilities to grab messages, you need a syslog dæmon that supports logging output to named pipes. Most recent syslog dæmons have this (check the man page for support, or try it directly). It is very important that you have a syslogd that opens pipes in non-blocking mode.
Older syslogs used to open the pipes in blocking mode, so if the named pipe was full (in our context, for example, if the connection had been broken for some time), then the whole syslog process stopped while waiting for the pipe to have some space available. This, of course, could have dramatic consequences, since all the logging, not just the remote one, would have been compromised. As a practical test to see whether your syslogd is handling pipes correctly, configure the whole system, then try writing in the pipe until it reaches the maximum dimension, usually 4KB. When the connection is down, check whether syslog is still logging something or if it seems to be frozen. Anyway, we suggest getting the latest version you can find (which should correct other bugs as well) and installing it on your system (at the moment, the latest version is 1.3-31). The more advanced syslog-ng package should work as well with our solution. Apart from this, you will need a Perl interpreter, the tail command from the textutils package, and of course, the secure shell package.
The theory behind this “push” solution is quite straightforward. A Perl script will run in the background, and will connect (using ssh) to the host where we wish to keep the copy of the logs. The Perl script will then monitor a named pipe we will create, and send anything it will get from the pipe over the secure connection to the other side, where we will put everything in a file. Instructing syslog to send the desired logging events to our named pipe will make the work complete.
Let's give a brief technical description of the source code (see Listing1). Once started, the script will open a local log (defined in $local_log) to report failures (this can be disabled by undefining the variable), then it will try to connect to the remote host (defined in $host). The connection subroutine opens the ssh command to the remote side, specifying the user name to be used, whether to use compression and to establish a quiet connection. As you can see, ssh will ask the remote end to execute a cat of a given file, so everything will be stored there for further use. Once connected, the script will open the local named pipe we created, then forward everything to the other side, where the cat command will store it. Of course, if the secure connection should die for some reason (technically, if the print on the file handle should fail), the script will try to establish the connection again.
As a last feature, the script will send a timestamp each defined time interval to the other end, which could be useful for tracking down when the last message actually came from the host if something serious has happened. In fact, the script can detect that the connection went down each time it sends something over the network. By sending a timestamp to the FIFO (i.e., first-in/first-out pipe), it will automatically trigger the check that the connection is alive at that moment.
Even if this should look like the easier step to accomplish, there are a few things you must pay attention to, so that an intruder, who has discovered the remote logging facility, won't be able to use it as a way to break in to the logger machine. First of all, you must generate your key pair on the network machine from which you're sending the logs by using the ssh-keygen utility. You do not have to set a password (just press return twice at the prompt) to this pair of keys (that should be used just for the logging program), so the script will not have to prompt you for it each time it starts.
With ssh2, you'll have to specify that you're going to use this newly created private key as a possible key to identify yourself. Add it to your ~/.ssh2/identification file by adding a new line with the IdKey statement, followed by the key name. (Alternatively, you could force this on the command line to ssh in the script.) Now, you'll have to transfer the public key to the system where you're going to store the logs, and add it to that user's authorized key file. (Usually ~/.ssh/authorized_keys in ssh1, where each line contains a key. In ssh2, copy the key to your ~/.ssh2 directory, then add an entry to your authorization file, ~/.ssh2/authorization by default, in the usual syntax Key key_filename.) This instructs sshd on that machine to authenticate anyone using the private key that couples with that public key without further checks.
Depending on the ssh configuration, you should normally have to log only once using this new generated key. At the first connection, ssh will have to add the key of the new host to your known_hosts files (by default, not an automatic task); it will prompt you to confirm that you would like to add the key. Pay attention to the authorization and key files permissions: they must be readable and writable only to the user—for security reasons first of all, but also because ssh will refuse to accept connections with such badly configured keys. Try logging in verbose mode if any problem occurs, since this will tell you exactly why the connection was not permitted.
A complete login over ssh could be done this way with that key. This would be very dangerous, so finally you must put some important restrictions on the use of the key. First, specify from which hosts this key could be used with the from= directive, so if someone were to steal the private key, he couldn't log on from any place other than the ones you specify. Second, inhibit the allocation of a pty with no-pty. Finally, limit the use of this key to just appending something to the logging file; that is, the command the Perl scripts execute on the remote machine. This is done using the command= directive and should be:
command="cat >>
At this point, you can be quite sure that nothing more than appending to the log file can be done, and you can use your normal account on some machine as a log storer, since this won't open you to any danger (other than having your space filled with logs), if the log key is correctly restricted by ssh. In ssh2, you'll have to add just the command restriction option, using the Command directive in your authorization file. Check out the man page for exact information on the syntax.
Since many sshd installations check the TCP wrappers files before allowing a connection, make sure the hosts from which you wish to connect are allowed to do so. As a user, be sure you don't have any .shosts or .rhosts files from the machine you are logging. If present, everything would seem to work (since authorizations would be done through these, if allowed by the dæmon), but the system would be very insecure.
The first thing we have to do is create the named pipe needed for the communication between the syslogd and our script. This is easy; use the mkfifo command with the full path name of the named pipe as the parameter to create it. Now you must tell syslogd the kind of messages it should send to that pipe, using the standard syslogd syntax:
log_facility: | /
It would be a good idea to send over the network the usual authorization tasks (auth.* and authpriv.*), critical messages (*.crit and *.emerg) and important kernel network-related messages (especially if you're filtering packets or something like that). It is very important to choose all the needed facilities to have a good overview of the system status. This changes, of course, from system to system (the facilities used could also be compilation, and thus distribution-dependent), so be sure to test it thoroughly.
Once all the steps are done, you must start the script, redirecting the standard error and output to /dev/null so ssh errors won't be displayed. Then, once it runs, restart syslogd so it will reread its configuration. If everything has been done correctly, you should now have your logs on the remote machine, and you should put the script in your startup file so it will be started at boot time. The program can run using normal user privileges; it doesn't need any special permission. Just make sure it can read from and write to the named pipe created for the communication with syslogd, and eventually, to the file selected for local error logging.
The second method is based on retrieving the data, a “pull”, instead of receiving it from the syslog. In this solution, a Perl script will run on the machine where we are going to store an additional copy of the logs, which will connect to the machine we want to monitor and obtain the logs from it. This task will be done just by using the tail command on the log file, so everything that comes to it will also be displayed on our side.
This script is quite similar to the first one, as you can see (in Listing 2). It has the usual procedure to connect to the remote host using ssh, the variables that hold the configuration and some local logging capabilities. Since the script is running on the machine that holds the additional copy of the logs, the logs for the script will also be stored on this machine. Once the connection is open, the script will continue reading from the ssh file handle and print everything on standard output, which will very likely be redirected to a file on startup. As we invoked it, ssh will execute tail on the log file, so each time something is added to the log file, it will be sent over. If an error occurs, then the script will try to reset the connection and log the error.
The configuration of ssh is very similar to the previous, but of course the roles are exchanged. In fact, here we will have to connect from the additional logging machine to the machine we want to log, so you will have to create the key on the additional machine and copy the public key to the authorized keys on the other machine. All other restrictions and hints should be used in the same way as before. Remember, you must now set the command to execute to the new one; that is, the tail defined in the Perl script by the $cmd variable.
In this example, you shouldn't have to change anything in the syslog, since you will be using tail on the already-existing log file. Anyway, we strongly suggest creating a new log file (using the usual syntax) including the most important things that need to be stored on the other machine. Another small problem arises from the fact that this consumes disk space. This is quite easily solved, since you can add a job to your crontab that deletes the file from time to time. Remember that syslog must be restarted if the log file has been deleted and recreated using touch, since it must exist for syslog to use it. If you have a recent version of textutils, then tail (with the --retry option) won't care about the file being deleted (use the rm command) and will create and open a new file. If you have an older version that doesn't support this option, you could just kill the tail command (this is easily done with some piping from ps to grep to kill) in the same cron as before, and the Perl script will then restart it from the remote side. Of course, we strongly suggest upgrading the textutils if possible.
Once configured, you should just start the script, redirecting the standard output (and error, too, but this isn't necessary) to the file that will keep your logs. The script on the logging machine doesn't need any special privileges. Of course, the file with the logs must always be readable by the user who is effectively running the tail command executed via ssh in the remote script. This means that if you're going to delete the file, you must set the correct ownership and modes each time in the cron job.
Both ways are effective for remote logging, and they are almost equivalent. We wanted to describe both, since one could be easier to use in some environments. For example, if you don't have sshd running on the machine where you want to store the logs, you could just install the ssh client in your home and use the second solution. Also, we wished to point out some interesting problems that can arise in communicating over a network.
The first method uses a somewhat “real time” logging. Each time a message is generated by syslog, it is sent through the pipe to the other side if possible. So, once something is generated by syslog, it is sent (in normal circumstances, of course) and can't be stopped in any way. This is a great feature for such a system, but causes a general process communication problem: buffering. Since the pipe that serves as a communication link between syslogd and the program was opened in non-blocking mode and has a finite buffer size, if syslog generates data too fast, then the data will be lost. If it could open it in blocking mode, we could solve this problem, but then if the pipe couldn't get free space in a short time, the process would just freeze and lose other data. This is an interesting problem of real-time communication. Practically, it shouldn't be a problem, since if you intelligently select which syslog facilities to send over the network and a decent connection exists between the two machines, then the buffer shouldn't get filled so quickly.
As a way of solving this possible data loss, the second program will read the logs from a disk file, jumping over the buffering problem (the disk file is the growing buffer, so it doesn't have such restricted limitations). Of course, this isn't a “real-time” solution, since tail has to check whether the file changes in given time periods, otherwise it would constantly be using the CPU to check for changes; see the --sleep interval in the info file. So theoretically in this short time period, a malicious cracker should find your program running and kill it (again, very improbable).
As to the security of the account used for the secure shell, the two methods are both secure if the restrictions are set correctly as explained. There shouldn't be a direct way to break from one machine to the other by using the restricted logging access, since nothing but the designed command can be used. Of course, you should have noticed that once a break-in is made on the machine we are logging, the cracker can initiate a denial-of-service attack by trying to fill the disk space on the remote machine. This is an old problem with logging and can be dealt with simply by placing the logs on a non-vital (that is, a non-root) partition or by using quotas if possible. Since the script doesn't need any other special permission to be executed, it is also evident that there aren't any root-privilege process problems.