Improving boot performance with Bootchart

Shining Boot


Bootchart analyzes the boot process and tells you where the system is wasting time.

By Kristian Kißling.

Markus Langer, Fotolia

In this article, I explain how to deploy Bootchart to investigate the boot process and discover where system optimization can be applied to maximum effect.

One of the gripes about Linux is the amount of time it takes to boot. When you switch on a Linux-based mobile phone, you don't want to wait half an hour before you can start to use it. Linux desktop users aren't infinitely patient either, and developers have introduced various tools over the years to tackle the issue of boot time. If you make the effort to analyze the boot process, the results can be remarkable. The Moblin2 distribution boots from a solid-state drive in just five seconds [1], and the boot time for a standard Debian installation on an Asus Eee PC 901 can be reduced to a fast 14 seconds.

A handy tool called Bootchart [2] investigates the boot performance of a Linux computer. The application painstakingly logs the boot times for individual services and processes, then it converts the data into a lengthy Gantt diagram and outputs it in EPS, PNG, or SVG format. The diagram serves as a guide for directing your performance-tuning efforts.

Bootchart is installed easily on Ubuntu 8.10 and openSUSE 11 with the distribution's package. Packages are available for other distros, too. On openSUSE, you additionally need to add the init=/sbin/bootchartd option as a kernel parameter in your Grub boot menu (see the box titled "Bootchart for openSUSE 11").

Introducing Bootchart

Bootchart consists of a shell script that runs before the init process. The script calls init, then it starts logging data. Bootchart gleans information from the /proc directory or, to be more precise, /proc/stat, /proc/diskstats, and /proc/*/stat. (The asterisk is a wildcard that stands for the ID of each running process; every process has its own directory below /proc.)
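
As a rough illustration of the kind of data /proc/stat offers, the following shell sketch extracts busy and total CPU ticks from the aggregate cpu line and computes the load between two samples. This is not Bootchart's own code: the function names cpu_ticks and cpu_busy_pct are made up for this example, and treating idle plus iowait as "not busy" is an assumption.

```shell
# Print "busy total" ticks from the aggregate "cpu" line of
# /proc/stat-style input on stdin. Only the columns user..steal
# (fields 2-9) are summed; idle and iowait count as not busy
# (an assumption made for this sketch).
cpu_ticks() {
    awk '$1 == "cpu" {
        total = 0
        for (i = 2; i <= 9 && i <= NF; i++) total += $i
        idle = $5 + $6
        print total - idle, total
        exit
    }'
}

# Integer percentage of busy CPU time between two samples.
# $1 $2 = busy and total of the first sample, $3 $4 = second sample.
cpu_busy_pct() {
    echo "$1 $2 $3 $4" | awk '{
        dt = $4 - $2
        if (dt <= 0) { print 0; exit }
        printf "%d\n", 100 * ($3 - $1) / dt
    }'
}
```

Calling cpu_ticks < /proc/stat twice, a short interval apart, and passing both results to cpu_busy_pct gives the average CPU load for that interval.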

Bootchart uses the virtual tmpfs filesystem to store the data in RAM initially. It stops collecting data when it sees the name of one of a couple of typical processes that occur when the login screen is displayed, such as gdmgreeter and kdm_greet. It then moves the collected data from RAM into the /var/log/bootchart.tgz file on your hard disk.
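
The stop condition can be pictured with a small shell sketch that checks a list of process names for one of the two greeters mentioned above. The function name login_screen_reached is hypothetical, and the polling loop in the comment is only one possible way to use it; how Bootchart itself performs the check is not shown here.

```shell
# Succeeds if the list of process names on stdin (one per line)
# contains one of the login-screen processes named in the text.
login_screen_reached() {
    grep -q -x -e 'gdmgreeter' -e 'kdm_greet'
}

# Possible polling loop (left as a comment so that sourcing this
# file has no side effects):
# until ps -e -o comm= | login_screen_reached; do sleep 1; done
```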

In Ubuntu, a Java application first automatically converts the data into a PNG graphic, which is stored in the /var/log/bootchart directory. In openSUSE 11, you need to enter the following at the command line to do the same thing:

$ sudo bootchart --format png

The graphic generated from the TGZ archive is then stored in the directory in which you issued the command.

Because the boot procedure is something like a race between dozens or hundreds of processes, the Bootchart graphic can be quite confusing. To make life easier for users, Bootchart comes with a tree-pruning function that lets you hide inactive and very short lived processes. Also, you can group similar processes for easier viewing.

Bootgraph

Before you start your analysis, hand over the stage to Bootgraph, a utility that comes from the Linux kernel developers. Bootgraph is included with the kernel scripts starting with version 2.6.28. In combination with a simple Perl script, Bootgraph shows what the Linux kernel does at boot time, and it offers a couple of optimization options.

To use Bootgraph, you first need to type uname -r to find out whether your kernel version is 2.6.28 or later. If so, reboot your system and add a new kernel parameter, initcall_debug, to the Grub boot manager, as described in the "Bootchart for openSUSE 11" box.
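
If you prefer to script the version check, a minimal sketch might look like the following. The function name kernel_at_least is made up for this example, and it compares only the leading numeric major.minor.patch part of the release string.

```shell
# Succeeds if kernel release $1 (for example, the output of
# `uname -r`) is at least version $2. Distribution suffixes such
# as "-11-generic" are ignored.
kernel_at_least() {
    awk -v have="$1" -v want="$2" 'BEGIN {
        split(have, h, /[^0-9]+/)
        split(want, w, /[^0-9]+/)
        for (i = 1; i <= 3; i++) {
            if (h[i] + 0 > w[i] + 0) exit 0
            if (h[i] + 0 < w[i] + 0) exit 1
        }
        exit 0
    }'
}
```

For example, kernel_at_least "$(uname -r)" 2.6.28 && echo "kernel is new enough for Bootgraph" prints the message only on a suitable kernel.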

Once the computer has booted with the new option enabled, type the following command to create a vector graphic:

$ dmesg | perl /usr/src/kernel/scripts/bootgraph.pl > /home/User/bootgraph.svg

This command uses a Perl script to dissect the output from dmesg and store the results - an SVG-formatted graphic - in the home directory for the user. To view the graphic (Figure 1), use the Firefox browser - although this gives you a landscape view - or use a vector drawing program like Inkscape to rotate it 90 degrees.

Figure 1: The Bootgraph script reveals that the kernel had only a minimal influence on boot time in our lab.

Bootchart for openSUSE 11

OpenSUSE 11 offers two approaches to specifying additional boot parameters: Either you can enter them permanently in the /boot/grub/menu.lst file - you need to add them to the line starting with kernel in the section for your current distribution - or you can type them directly in the Grub boot menu to apply them for a single boot. To enter the information in the Grub boot menu, use the arrow keys to move to the right boot entry, and press E to edit. Then add init=/sbin/bootchartd to the end of the line starting with kernel. Pressing Enter confirms the change, and pressing B will boot your machine with the modified line.
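
After the reboot, you can confirm that the parameter actually reached the kernel by looking at /proc/cmdline. The helper below is a minimal sketch; the function name has_boot_param is hypothetical.

```shell
# Succeeds if the kernel command line on stdin contains the
# exact parameter given as $1.
has_boot_param() {
    tr ' ' '\n' | grep -q -x -F -e "$1"
}
```

For example, has_boot_param init=/sbin/bootchartd < /proc/cmdline returns success only if the system was booted with that option.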

Seeing is Believing

At the top of the Bootchart-generated diagram (Figure 2) are various statistics that give you the date of the test, the name of your Linux distribution, and your kernel version. Below this, you will see details of your CPU and the enabled kernel options. The time that elapses during the boot phase is listed next to the time keyword.

Figure 2: Before optimizing: Bootchart generates a chart that shows where the boot process is wasting time.

The two diagrams that follow show the CPU load during the boot phase and the input/output activity. Valleys represent idle processor time. Because the diagram shares the X axis with the main chart, you can also see which processes are not stressing the CPU sufficiently and thus are wasting valuable boot time. The next diagram looks similar but relates to the hard disk, showing the data throughput and active time for the disk.

The main part of the Bootchart graphic starts below this. The X axis represents the number of seconds that elapse during the boot process. For easier orientation, the diagram includes thin vertical lines at one-second intervals.

This section comprises a number of labeled bars that run from left to right. Each bar represents a single process during the boot phase. Although some of the bars terminate after a couple of seconds, others drag through to the end: These processes keep running after the boot phase terminates and include services such as the CUPS or the ACPI daemon. To determine when a process starts and when it ends, see where the horizontal bars intersect the vertical timelines. Child processes are connected to the parent processes in the graphic by dotted lines.

What to Cut

When optimizing boot times, you will mainly be interested in the gaps between the points at which successive processes start as you move down the vertical axis. How long does a process wait before it allows the next process to start? Do some processes block the boot procedure for an excessively long time? Can some of the boot processes run in parallel?

On the horizontal axis you will want to see whether you really need all the active processes. To do so, first correctly identify the processes and evaluate their functions; however, do not uninstall the programs or move their init scripts (e.g., hwclock.sh) to another directory until you are absolutely certain that your system does not need the process to survive. In our lab, removing unneeded services in this way involved very little effort and reduced the boot time from 33 to 26 seconds (Figure 3) - all without massively invasive system tweaking.

Figure 3: Targeted uninstalling of programs will reduce the boot time. In our lab, we managed to save seven seconds.

No universal recipe exists for accelerating the boot process. Many Linux distributions use Bootchart to optimize the boot time for the default installation, but you are likely to find even more time-saving options on your own computer. In our lab, we uninstalled Samba, Tor/Privoxy, and Bluetooth and moved the hwclock.sh init file to reduce the boot time. If you only print occasionally, you don't need to launch CUPS when you boot the system; instead, you could disable the CUPS daemon by default and move the start script out of the /etc/init.d/ folder. The same principle applies to other services that you only use occasionally - but remember to create backup copies before you make any changes to the scripts.
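
Moving a start script out of the way can be made reversible with a pair of small helper functions. The following is only a sketch: the function names park_init_script and restore_init_script are made up for this example, and the holding directory is whatever location you choose. Because the script is moved unchanged, the parked file also serves as your backup copy.

```shell
# Move init script $1 into holding directory $2 without
# overwriting anything that is already parked there.
park_init_script() (
    script=$1
    holding_dir=$2
    name=$(basename "$script")
    [ -f "$script" ] || { echo "no such script: $script" >&2; exit 1; }
    [ ! -e "$holding_dir/$name" ] || { echo "already parked: $name" >&2; exit 1; }
    mkdir -p "$holding_dir" && mv "$script" "$holding_dir/$name"
)

# Put a parked script back.
# $1 = holding directory, $2 = script name, $3 = init directory.
restore_init_script() (
    [ -f "$1/$2" ] || { echo "not parked: $2" >&2; exit 1; }
    mv "$1/$2" "$3/$2"
)
```

Run as root, a call such as park_init_script /etc/init.d/hwclock.sh followed by a holding directory of your choice takes the script out of the boot sequence, and restore_init_script brings it back if the system turns out to need it.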

After taking care of the more obvious choices, such as disabling various automatically loaded services, you need to tackle some of the more complicated options for accelerating the boot process. These subtler alternatives mainly relate to the kernel and other system components. For example, you could build hardware support directly into the kernel instead of loading it as modules and completely do without initrd and initramfs. Doing without hwclock, the tool that sets the system clock, is not as significant, but it could still save you some time. Then, once the system boots, cron could handle this job on a regular basis.

Hardware optimization options include booting the CPU at the fastest supported clock speed (if this is not done automatically). Also, you can save time with the Udev daemon: Whenever a machine is booted, the daemon tries to detect and enable all the devices. To save yourself this long-winded search, run a script by Phil Endecott (the guy who optimized Debian to 14 seconds) to automatically mount all the devices it has detected in /dev at boot time. However, this optimization comes at the price of losing the flexibility of plugging in new hardware.

Conclusions

Bootchart does not give you specific tips for accelerating your system boot time, but it does show you what is slowing the system down. Then you need to draw your own conclusions to know where to tweak the boot process. Of course, there is always the danger of optimizing your system to death and losing any time you gained in complicated repairs. Before you do anything serious, make sure you know exactly what effect the change will have on your system.

INFO
[1] Moblin boots in five seconds: http://www.linuxdevices.com/news/NS7654890804.html
[2] Bootchart: http://www.bootchart.org