The Power of Tiny initrd

Eduardo Arcusa Les

Issue #263, March 2016

A success story of how a PACS server ends up in a tiny initrd.

Those of you who have tried it will agree with me on the benefits of using an initrd/initramfs as a real root filesystem and of the possibility of not using a hard disk drive (HDD) in your servers. If you haven't tried it, I invite you to continue reading.

If you don't already know, initrd/initramfs is a scheme to load a temporary root filesystem into RAM. initrd and initramfs refer to two different methods of achieving this. Both are commonly used to make preparations before the real root filesystem can be mounted. Years ago, initramfs didn't exist, and everyone used initrd. Today, people continue using initrd.

But at my workplace, instead of using initrd as a temporary root filesystem, we use initrd as the real root filesystem. That's why many of our servers don't need an HDD: everything is in RAM. The only data stored on an HDD is data that may vary from boot to boot and needs to be preserved. A database server is a typical case: we put the PostgreSQL/MySQL binaries and libraries into the initrd and only the data files on the HDD.

Having a basic tiny initrd/initramfs, to which we can add anything needed for each system, with a tiny custom kernel, and booting them with PXE has saved us more than one catastrophe and makes our work easier.

The advantages of using initrd/initramfs are considerable:

RAM: memory is cheaper, faster, less energy-consuming and less susceptible to failures than HDDs or even SSDs.
Modern hardware, more RAM: new hardware supports large amounts of installed RAM.
Easier crash recovery: if a server crashes or someone accidentally erases a file, you just need to reboot it, and it will wake up again without any problems. Also, if a server crashes because of a hardware problem, you just need to replace the broken hardware, add your new server in a DHCP config file and create a new MAC entry in the PXE server.
Easier backups: the size of the initrd is minuscule, and it's easier to have a copy compared with a whole system in HDD.
64-bit: 64-bit systems can handle a lot of RAM.
Easier scaling: adding a server is easy, just connect it, enable PXE boot in the BIOS, add an entry in the DHCP server and create a MAC entry in the PXE server.
Security: if an intruder modifies any system file on your server, it will be restored when it restarts. If the server is infected with a worm, virus or rootkit, just restart it and move on.

initrd as a Real Root Filesystem

One of the reasons administrators don't use initrd as a real root filesystem is that changes disappear when the server restarts unless they are incorporated into the initrd. The steps for doing this are tedious: uncompressing the initrd file, mounting the filesystem, making changes, unmounting, compressing and restarting the server to pick up the new changes.
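
That cycle can be sketched as a short shell function. This is a minimal sketch, assuming the initrd is a gzip-compressed filesystem image. The image name fs-pacs.gz is the one used later in this article; the mount point /mnt/initrd and the helper names run and edit_initrd are hypothetical. The edit step is only a placeholder, and the commands are printed rather than executed unless DRY_RUN=0 is set, because the real steps need root:

```shell
#!/bin/bash
# Sketch of the edit cycle for a gzip-compressed initrd image.
# Prints the commands by default; set DRY_RUN=0 (as root) to execute them.

run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

edit_initrd() {
    local image_gz=$1 mnt=$2
    local image=${image_gz%.gz}                    # fs-pacs.gz -> fs-pacs

    run gunzip "$image_gz"                         # 1. uncompress
    run mount -o loop "$image" "$mnt"              # 2. mount the filesystem image
    echo "# ... make the changes under $mnt ..."   # 3. placeholder for the edits
    run umount "$mnt"                              # 4. unmount
    run gzip -9 "$image"                           # 5. compress again
}

edit_initrd fs-pacs.gz /mnt/initrd
```

The server still has to be restarted afterward to load the rebuilt image.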

To work around this issue, we consider an initrd as a good enough filesystem when there's almost no need for modifying it.

In order to achieve this, we put files known to be susceptible to changes over time outside the initrd. These files normally are configuration files of services, scripts and files inside the /etc directory, such as ethers, resolv.conf or hosts.allow. These files are stored in our TFTP/SSH server. When the server boots, it gets its kernel and initrd, and then copies all these files via SSH.

For example, think of a caching proxy server. If you have a minimal basic initrd, you need to put only the libraries and binaries into it. But, the configuration file should be copied remotely from a different server where it can be modified in an easier way than having to modify the whole initrd filesystem.

All of these initrds, kernels and configuration files for each server are stored in our TFTP server. This server has DHCP, SSH and PXE services installed. When a server boots, it gets its kernel and initrd files from the TFTP server and the configuration files via SSH.

There's no need to restart the server with every change we make; we just edit the file on the TFTP server, get it with sshfs or rsync into the proxy server and restart the proxy service.
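
That workflow boils down to "copy the file if it changed, then restart the service". The following is a minimal sketch, not our production script: refresh_config is a hypothetical helper, and the Squid paths in the usage comment are assumptions chosen only because the example is a proxy.

```shell
#!/bin/bash
# refresh_config SRC DST CMD...
# Copy SRC over DST and run CMD (for example a service restart)
# only when the two files differ or DST does not exist yet.
refresh_config() {
    local src=$1 dst=$2
    shift 2
    if cmp -s "$src" "$dst" 2>/dev/null; then
        echo "unchanged: $dst"
        return 0
    fi
    cp -f "$src" "$dst" && "$@"
}

# Hypothetical usage, with the TFTP server's directory mounted on /mnt:
#   refresh_config /mnt/proxy/etc/squid.conf /etc/squid/squid.conf \
#       /etc/init.d/squid restart
```

Skipping the restart when nothing changed keeps the helper safe to run repeatedly.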

Perhaps a graphic will make it easier to understand—see Figure 1.

Figure 1. Here's an example of a proxy server with initrd as a real root filesystem. 1) Configure the proxy server BIOS to boot with PXE and configure the DHCP service to tell this server the IP of the TFTP server. 2) Configure the PXE service in the TFTP server to serve the kernel and initrd files to the proxy server. 3) The proxy server knows the IP of the TFTP server and asks it for the kernel and initrd files. 4) The proxy server boots and executes the /etc/rc script. This script mounts a directory of the TFTP server via SSHFS and copies the configuration files.

Introduction to the PACS Server

At my workplace, we have many servers without HDDs that get their kernels and initrds from a TFTP server and then copy their configuration files. We also have some servers with HDDs to store critical data, such as database files, but they also boot with PXE. And when you see that it works magnificently, you will want to implement it on as many servers as possible.

And, this is the reason why this story began—the story of an uncommon server that finally ends in a tiny initrd.

A PACS (Picture Archiving and Communication System) is a system that stores all kinds of digital medical images, so other computers can consult them later. The universal format for PACS image storage and transfer is DICOM (Digital Imaging and Communications in Medicine). DICOM enables the integration of scanners, servers, workstations, printers and network hardware from multiple manufacturers into a picture archiving and communication system. At my workplace, we use dcm4chee for such purposes because it's very powerful.

We administer a university dental clinic with 56 medical boxes, each one equipped with a workstation that allows users to view and upload radiological images. These sorts of images also are generated and uploaded from different kinds of scanners: intraoral, 2D/3D and CAT. For medical and academic purposes, teachers and students can work and do research from a lab classroom equipped with 15 workstations. Even though the images come from different manufacturers' equipment, all of them are stored in the same format (the DICOM standard) and in a unique repository, a PACS server.

The images are stored in a big disk array with 10TB connected with fiber to the PACS server, with all the image data (date, modality, description) and patient information (unique ID, name, sex, birth date) in the database. We have more than 50,000 patients stored right now.

We also have a DICOM Worklist service in the PACS server. This reduces operator search time: the list of patients with appointments on a given day can be consulted from any computer, and images can be uploaded only for those patients.

With the DICOM Worklist, it's easier to search and upload images for the correct patient: a patient ID that is not in the DICOM Worklist will not appear in the search results, so you cannot upload an image for that patient. The possibility of uploading an image for the wrong patient is significantly reduced.

The clinical management software runs on a cluster of Web servers and allows us to list all the patients, their personal data and medical exams, and to view and create their diagnoses or treatments.

This is interesting: this cluster has eight servers. Seven of them run entirely from RAM, and only one of those seven has an HDD, used solely for the database. The eighth, the TFTP server, is the only one that boots from an HDD, because it holds the kernels, initrds and configuration files of all the other servers in the cluster.

Five years ago, we set up two PACS servers on Linux. Because of the complexity and how little we knew at the time, we didn't create initrds for them.

These two PACS servers are PACS-Master and PACS-HA (high availability). Both have a big disk array attached with 10TB to store the images. When someone uploads an image, PACS-Master saves it and sends all the records and images to the PACS-HA. If the PACS-Master has a problem and doesn't respond, the keepalived service gives the control to PACS-HA.

PACS Server to RAM

Through the years, we have had to change the HDD of PACS-Master twice because of failures. We have backups of the database and an image of the whole HDD, but the size of the image is very large, and with the new changes, we needed to create it again. That is not KISS.

We decided to upgrade the server to 64-bit. That meant creating a minimal initrd, upgrading dcm4chee and the old database, importing the images, and studying how dcm4chee and the DICOM Worklist work in order to know which files would go into the initrd and which ones onto the HDD.

Table 1 shows the specifications of the old 32-bit servers and the new 64-bit PACS-Initrd server.

Table 1. The Specifications of Both an Old 32-Bit Server and the New 64-Bit PACS-Initrd Server

PACS-Master/PACS-HA	PACS-Initrd
CentOS 5.2, 32 bits	CentOS 5.2, 64 bits
4GB RAM (3.3 available)	16GB RAM
dcm4chee 2.14.2	dcm4chee 2.17.3
DICOM Worklist	DICOM Worklist
Java 5 32 bits	Java 6 64 bits
PCI Fiber	PCI Fiber
Big disk array 10TB	Big disk array 10TB

I'm not going to go into how to create an initrd or how to upgrade dcm4chee here; those topics are more technical and beyond the scope of this article.

The real difficulty was learning how the dcm4chee and DICOM Worklist services operate and which files had to be located in the initrd and which ones in the HDD. Because initrd is a block device with a limited size, if you have files that grow over time and exceed that size, you'll get a kernel panic.
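
A simple guard against that failure mode is to watch how full the root filesystem is. This is a minimal sketch, assuming a POSIX df and awk are available in the initrd; the function name fs_usage_percent and the 90% threshold are hypothetical:

```shell
#!/bin/bash
# Print the usage of the filesystem holding the given path
# as an integer percentage.
fs_usage_percent() {
    # df -P prints one line per filesystem; column 5 is the capacity (e.g. 42%)
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

threshold=90
used=$(fs_usage_percent /)
if [ "$used" -ge "$threshold" ]; then
    echo "WARNING: root filesystem is ${used}% full"
fi
```

A check like this could run from cron or be wired into whatever monitoring is already in place.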

For example, think of a directory that you know a service uses and that contains files that increase in size over time. You only need to create in the initrd a symbolic link with ln -s to the HDD:

[root@pacs1 /dcm4chee/server/default]$ ls -la | grep tmp
lrwxrwxrwx  1 root root   38 Mar 13 17:27 tmp -> /HD/dcm4chee/tmp/

It's important to remember that if you change the HDD, you also need to create the directories again, or the symbolic link will be broken.
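
Recreating those directories by hand after a disk swap is easy to forget. As a minimal sketch (ensure_link_targets is a hypothetical helper, not part of dcm4chee), the following walks one directory of the initrd and creates the target directory of every dangling symbolic link it finds:

```shell
#!/bin/bash
# ensure_link_targets DIR
# For every symbolic link directly under DIR whose target is missing,
# create the target as a directory.
ensure_link_targets() {
    local dir=$1 link target
    for link in "$dir"/*; do
        [ -L "$link" ] || continue      # only symbolic links
        [ -e "$link" ] && continue      # target already exists
        target=$(readlink "$link")
        case "$target" in
            /*) ;;                          # absolute target, use as is
            *)  target="$dir/$target" ;;    # relative to the link's directory
        esac
        mkdir -p "$target" && echo "created $target"
    done
}

# Hypothetical usage after mounting a freshly formatted HDD on /HD:
#   ensure_link_targets /dcm4chee/server/default
```

It assumes that every dangling link in the directory is meant to point at a directory, as the tmp link above does.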

When we got the job done after months of work and tests, we had a kernel of 2.2MB and an initrd of 204MB compressed, 420MB uncompressed. Normally, our initrds occupy less than 100MB uncompressed. You might think that this initrd is not very tiny, but keep in mind that dcm4chee and the DICOM Worklist alone account for 80% of the total size.

Into the initrd and HDD

We have incorporated all these applications and services into our tiny initrd filesystem: PostgreSQL, Postfix mail server, NTP, SNMP, syslog, OpenSSH, Keepalived, cron, dcm4chee, DICOM Worklist and Java 6. Keep in mind that none of those services start from the HDD; they all run from RAM!

We have incorporated these data files into our HDDs: the database, the dcm4chee and the DICOM Worklist files that increase in size over time.

PACS-Initrd at Startup

The PACS-Initrd boots first from Ethernet, so we enable PXE boot in the BIOS. We add an entry for this server in DHCP and the IP of its TFTP server:

[root@tftp-server /]# cat /etc/dhcpd.conf | grep -i pacs1 -A 5
host pacs1 {
            hardware ethernet xx:xx:xx:xx:xx:xx;
            fixed-address 10.0.0.12;
            filename "pxelinux.0";
            next-server 10.0.0.1;
           }
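
Because every new server needs the same two pieces, a DHCP host entry and a PXE entry keyed by its MAC address, both can be generated. This is a minimal sketch under assumptions: the function names, the host name pacs2, and the sample MAC and IP address are hypothetical. It relies on PXELINUX's lookup of a configuration file named 01- followed by the client's MAC address in lowercase, with dashes as separators:

```shell
#!/bin/bash
# dhcp_host_entry NAME MAC IP TFTP_SERVER_IP
# Print a dhcpd.conf host stanza like the one shown above.
dhcp_host_entry() {
    printf 'host %s {\n' "$1"
    printf '    hardware ethernet %s;\n' "$2"
    printf '    fixed-address %s;\n' "$3"
    printf '    filename "pxelinux.0";\n'
    printf '    next-server %s;\n' "$4"
    printf '}\n'
}

# pxe_config_name MAC
# Print the per-client file name PXELINUX looks for under pxelinux.cfg/.
pxe_config_name() {
    # "01" is the ARP hardware type for Ethernet
    printf '01-%s\n' "$(printf '%s' "$1" | tr 'A-F:' 'a-f-')"
}

dhcp_host_entry pacs2 00:1A:2B:3C:4D:5E 10.0.0.13 10.0.0.1
pxe_config_name 00:1A:2B:3C:4D:5E
```

One possible arrangement, assumed here rather than taken from our setup, is a symbolic link from the MAC-derived name to a shared configuration file, so several servers can boot the same kernel and initrd.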

We configure the PXE service in the TFTP server to serve the kernel and initrd to PACS-Initrd:

[root@tftp-server /]# cat /tftpboot/pxelinux.cfg/pacs
prompt 1 
timeout 5
default L
label L
kernel kernels/kernel-2.6-64-pacs
append rw root=/dev/ram0 load_ramdisk=1 ramdisk_size=524288 initrd=initrd/fs-pacs.gz ip=10.0.0.12:10.0.0.1:10.0.0.7:255.255.0.0:pacs1
ipappend 1
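
The ramdisk_size kernel parameter is expressed in KiB, so it is worth checking that the RAM disk is large enough for the uncompressed initrd. A quick check with shell arithmetic, treating the 420MB figure quoted earlier as approximate (the variable names are only illustrative):

```shell
#!/bin/bash
ramdisk_kib=524288      # ramdisk_size= from the append line, in KiB
initrd_mb=420           # approximate uncompressed initrd size from this article

ramdisk_mib=$(( ramdisk_kib / 1024 ))
headroom=$(( ramdisk_mib - initrd_mb ))

echo "RAM disk: ${ramdisk_mib} MiB, headroom: about ${headroom} MB"
```

That headroom is all the space available to files that are created or grow inside the RAM disk.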

PACS-Initrd starts with that kernel and initrd and executes the /etc/rc script. This script mounts a directory of the TFTP server via SSHFS, copies into PACS-Initrd all the configuration files that are often modified (such as those for Keepalived, PostgreSQL, cron and the /etc directory) and then unmounts the directory:

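echo "Getting config files via SSHFS ..."
# Mount the configuration tree exported by the TFTP/SSH server
sshfs user@10.0.0.1:/servers/ /mnt

# Copy this server's files into the RAM filesystem (-L follows symlinks)
/bin/cp -aL /mnt/pacs/db /
/bin/cp -aL /mnt/pacs/etc/* /etc/
/bin/cp -aL /mnt/pacs/home/sysman/* /home/sysman/
/bin/cp -aL /mnt/pacs/root /
/bin/cp -aL /mnt/pacs/var/* /var/
/bin/umount /mnt

# Hand over to rc.local, which launches the services
/etc/rc.local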

PACS-Initrd executes /etc/rc.local and launches all the services.

Creating a High-Availability PACS Server

We waited a few months before changing the PACS-HA to an initrd filesystem. During that time, PACS-Initrd worked perfectly without problems. We had no complaints from our users—and as you know, no news is good news.

Once we had an initial initrd, creating PACS-Initrd-HA took 15 minutes (not counting the time needed to import and upgrade the old database). We added the new server in DHCP so that it contacts the TFTP server and downloads the same kernel and the same initrd as PACS-Initrd.

Some services needed different configuration files for each PACS server, so the solution was to create new configuration files in the /servers/pacs/ directory of the TFTP server. For example:

[root@tftp-server /servers/pacs/etc]# ls -la | grep keepalived
-rw-rw-r--  1 root root  552 Mar  7 11:24 keepalived-pacs1.conf
-rw-rw-r--  1 root root  551 Mar  7 11:24 keepalived-pacs2.conf

In order to avoid having two different directories for both PACS servers, we configured /etc/rc.local to choose the appropriate configuration files for each PACS server:

[root@pacs1 /]$ cat /etc/rc.local
#!/bin/bash

echo -e "\n--- rc.local ------"

echo "Some extra configuration ..."
HOSTNAME=$(/bin/hostname)

echo "Configurations for $HOSTNAME ..."
case "$HOSTNAME" in
    "pacs1" | "pacs2")
        # Keep this host's variant of each file, then remove the other host's copy
        mv "/home/sysman/host-$HOSTNAME.conf" /home/sysman/host.conf
        rm /home/sysman/host-*.conf
        mv "/etc/postfix/main-$HOSTNAME.cf" /etc/postfix/main.cf
        rm /etc/postfix/main-*.cf
        mv "/etc/keepalived-$HOSTNAME.conf" /etc/keepalived.conf
        rm /etc/keepalived-*.conf
        ;;
    *)
        echo "ERROR in HOSTNAME ($HOSTNAME) ..."
        exit 1
        ;;
esac

Then we formatted an HDD partition as ext4, initialized the database, imported and upgraded the old database of PACS-HA and created the directories that the symbolic links in the initrd point to.

Other Configurations

Security:

We secured the common binaries with chattr +i (the immutable attribute). An intruder who gains access can't replace a common binary with a rootkit, because the binary is immutable. And, of course, our initrd doesn't include the chattr binary.
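
Because the running system has no chattr, the attribute is presumably set while the image is being built. This is a minimal sketch under that assumption: the image is loop-mounted at the hypothetical path /mnt/initrd on a filesystem that supports attributes (such as ext2), the commands run as root, and lock_binaries is a hypothetical helper:

```shell
#!/bin/bash
# lock_binaries FILE...
# Set the immutable attribute on each regular file.
# CHATTR can be overridden (for example CHATTR=echo) to preview the commands.
lock_binaries() {
    local chattr_cmd=${CHATTR:-chattr} f
    for f in "$@"; do
        # Skip symlinks and anything that is not a regular file
        if [ -f "$f" ] && [ ! -L "$f" ]; then
            "$chattr_cmd" +i "$f"
        fi
    done
}

# Hypothetical usage while the initrd image is mounted on /mnt/initrd:
#   lock_binaries /mnt/initrd/bin/* /mnt/initrd/sbin/* /mnt/initrd/usr/bin/*
#   lsattr /mnt/initrd/bin/ls      # the "i" flag should be listed
```

Once a file is immutable, updating it in the image requires clearing the attribute first with chattr -i.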

We configured TCP wrappers to deny all SSH connections except the ones needed by the servers to communicate with the PACS server.
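
With TCP wrappers, a policy of that kind is usually a deny-all rule plus a short allow list. This is a minimal sketch, assuming sshd is built with TCP wrappers support. The allowed address 10.0.0.1 (the TFTP server in this article) is only an example, and write_ssh_wrappers and its prefix argument are hypothetical:

```shell
#!/bin/bash
# write_ssh_wrappers PREFIX ALLOWED_HOST...
# Write PREFIX/etc/hosts.allow and PREFIX/etc/hosts.deny so that only
# the listed hosts may connect to sshd. Existing files are overwritten.
write_ssh_wrappers() {
    local prefix=$1
    shift
    mkdir -p "$prefix/etc"
    # hosts.allow is consulted first; a match grants access
    echo "sshd: $*" > "$prefix/etc/hosts.allow"
    # every sshd client not matched above is refused
    echo "sshd: ALL" > "$prefix/etc/hosts.deny"
}

# Hypothetical usage on the TFTP server, where the per-server /etc files live:
#   write_ssh_wrappers /servers/pacs 10.0.0.1
```

Writing the files under a prefix fits the scheme described earlier, in which files such as hosts.allow are kept on the TFTP server and copied at boot.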

We configured the ethers file with a static MAC to prevent a MITM (Man-In-the-Middle) attack.

We configured snmpd.conf to deny all queries except ones coming from our monitor server.

Centralized Logs:

We configured syslog to send all logs to another server that has the syslog-ng service running.

Backup:

At night, our backup server backs up the database and the configuration of the dcm4chee service.

Internal Monitoring:

We use SMART (Smart Monitoring and Rebooting Tool) for such purposes, because it's not only a passive monitor, but it's also an active monitor that can auto-recover a service without the system administrator's intervention if it detects that it's down.

We added dcm4chee and the DICOM Worklist services to SMART.

Figure 2. Viewing All the Services We're Monitoring with SMART

External Monitoring:

We monitor the server behavior and status with SNMP and Zabbix. We also monitor the services with Nagios.

Figure 3. The Zabbix Control Panel Monitoring the PACS Server with SNMP

Conclusion

The benefits of doing this are considerable. We saw in Zabbix that performance improved substantially. Backups are simple because the initrd is minuscule, and it's easier to keep a copy of it than of a whole system image of the HDD. And, we need to back up only one initrd, because PACS-Initrd and PACS-Initrd-HA use the same one. If any file is corrupted or accidentally deleted, the solution is just to restart the server, and it'll wake up again in a healthy state without having to re-install the whole OS. If the HDD fails, it will be easy to recover, because it stores only the database and the configuration files and not the whole OS. If the hardware breaks down, we just have to replace it and add a new server in DHCP and a MAC entry in the TFTP server. The server's security is improved, because if we restart the server, any changes made by viruses, worms or rootkits will disappear. Setting up a second PACS server was easy, because we already had the initial initrd. The most time-consuming part was creating the configuration files, adding a few lines in rc.local and a new entry in the DHCP service.

I hope this article helps you consider the benefits of having a server starting with PXE, and I encourage you to do it on many of your servers.

Acknowledgements

PACS-Initrd was created, developed, tested and enjoyed in the IT Department of the UIC Barcelona. Albert Martorell, Isaac Vázquez, Andreu Garcia, Ildefonso Aranda, Josep Pablo, Vicente Sangrador and Jordi Xavier Prat have collaborated on this project and encouraged me to write this article. Thanks to this awesome team! Without you, this would not have been possible.

Finally, thanks to my girlfriend for always having the patience to listen to me when I talk excitedly about the PACS server, initrds and kernels, even without understanding any of it.