An overview of current capabilities and achievements in Linux audio development.
The world of Linux audio covers many domains, from basic desktop sound services to embedded systems, from simple Internet telephony to the demands of professional recording studios. This article presents an overview of the Linux audio world and its current status.
Due to the breadth of the topic, I have divided this article into two parts. For the same reason, it is impossible to discuss any particular program in-depth in this kind of survey. However, I cover many of the programs mentioned here in my articles for the Linux Journal Web site (www.linuxjournal.com), and I refer readers to those articles for more detail on individual programs.
Sound support in Linux has progressed grandly since my first experience with the system in the mid-1990s. The mainstream distributions all have excellent device detection, including sound card detection, and the typical desktop audio functions are configured transparently during installation. Most distributions let users add and configure extra sound devices manually, but some detect and configure multiple devices automatically. Any configuration needed after installation is handled similarly through control panels and other user-friendly utilities. In addition to technical advances, Linux sound and music applications have grown in number and sophistication. We now have excellent software for media production and playback, and there is good reason to expect continued development in audio-oriented domains.
In this overview, I distinguish between the two broad categories of system software and applications software. System software here includes the kernel sound system and other tools and utilities that make the user-level programs work. This software is usually not associated with normal usage, and typical users may, in fact, never even know about it. Nevertheless, this layer is where the heavy lifting is done, and although it's not flashy or sexy, it is the heart of the Linux audio system. In contrast, applications software includes the programs that present themselves to users via the distribution's menus, toolbars and file managers. This software is what typical users understand and employ on a regular basis.
The first part of this article covers system software and a variety of other audio-related software domains. The second part focuses on the state of Linux sound and music production software.
ALSA (Advanced Linux Sound Architecture) provides the core audio and MIDI services to the Linux kernel. These services include the device drivers installed with the kernel, a library and API for programmers, various user-level tools and utilities, and firmware for some USB and other devices. If a project's development is reflected in its changelogs, ALSA is clearly a very active project, with a steady stream of enhancements and fixes, and an expanding list of supported sound cards and audio chipsets.
The developers at 4Front Technologies have improved their OSS (Open Sound System) Linux package in similar fashion. In 2007, the company announced the decision to place the system under open-source licensing. As a result, OSS is now a free, open-source project, complete with source repository, Bugzilla, wiki and protection by the GPL, BSD and CDDL licenses. But, all this goodness isn't only for Linux. The OSS package also provides high-quality audio/MIDI services to our comrades on UNIX systems, such as FreeBSD and Solaris.
ALSA and OSS provide the device drivers needed to make your sound hardware usable by the operating system. Sometimes they create these drivers by consulting material provided by manufacturers, and sometimes they reverse-engineer a driver. To my knowledge, only Audio Science offers Linux drivers developed in-house. Audio Science manufactures high-quality audio hardware marketed mainly to radio broadcasters, and codes and provides native Linux drivers for its products. Ah, if only [manufacturer's name deleted] would be so wise.
Normal desktop actions and activities that require audio services include system sounds, media players, Internet telephones and simple recording. However, normal users now expect amenities, such as transparent software mixing and relatively glitch-free performance, in a multitasking system. ALSA's dmix plugin provides software mixing, but not all distributions want to employ it. Thus, competition remains for the position of the default Linux desktop audio server. GNOME still uses esd (the Enlightened Sound Dæmon), and KDE still backs the aRts dæmon, but the PulseAudio Project definitely is the newcomer to watch. PulseAudio already has been adopted as the sound server of choice for the OLPC XO laptop and for recent releases of Ubuntu, and there's reason to believe it may overtake esd and/or artsd as the One True Server for typical users' sound-related activities.
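For readers wondering what "software mixing" amounts to, here is a minimal sketch in Python. It is not how dmix or any of the sound servers named above is implemented; it only illustrates the arithmetic, assuming every stream is already signed 16-bit PCM at the same sample rate. The function name mix_pcm16 and the sample values are hypothetical.

```python
import array

def mix_pcm16(*streams):
    """Mix signed 16-bit PCM streams by summing samples and clipping.

    Assumes all streams share one sample rate and channel layout;
    a real mixer also has to handle format and rate conversion.
    """
    length = max((len(s) for s in streams), default=0)
    mixed = array.array("h", [0] * length)
    for i in range(length):
        # Shorter streams simply contribute nothing once they run out.
        total = sum(s[i] for s in streams if i < len(s))
        # Clip to the 16-bit range instead of letting the sum wrap around.
        mixed[i] = max(-32768, min(32767, total))
    return mixed

# Two hypothetical sources: a media player and a desktop alert sound.
player = array.array("h", [1000, 2000, 30000])
alert = array.array("h", [500, -2500, 10000, 42])
print(list(mix_pcm16(player, alert)))  # [1500, -500, 32767, 42]
```

The third sample shows why a mixer must clip or scale: 30000 + 10000 does not fit in 16 bits, so the sum is held at 32767 rather than wrapping into a loud glitch.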
The demands of professional audio production require a different order of performance from a sound server. None of the servers mentioned above is capable of dropout-free performance under heavy resource demand (for example, multichannel recording with high sample rates and bit depths), and they cannot be considered suitable under pro-audio conditions. Fortunately, Linux has JACK, a truly professional-grade audio server and master transport system. If you plan on producing professional-quality audio with Linux sound and music software, you need to know JACK.
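Some rough arithmetic shows why pro-audio work is a different order of demand. The sketch below is only a back-of-the-envelope calculation, not a measurement of any server. The channel counts, sample rates, bit depths and buffer sizes are assumptions chosen for illustration, and the function names are hypothetical.

```python
def stream_bytes_per_second(channels, sample_rate_hz, bit_depth):
    """Raw PCM data rate for an uncompressed audio stream."""
    return channels * sample_rate_hz * (bit_depth // 8)

def buffer_latency_ms(frames_per_period, periods, sample_rate_hz):
    """Time represented by the audio buffer, in milliseconds.

    This is also roughly the deadline the system must meet on every
    cycle: miss it and the listener hears a dropout.
    """
    return 1000.0 * frames_per_period * periods / sample_rate_hz

# Assumed desktop case: stereo, 44.1kHz, 16-bit.
desktop = stream_bytes_per_second(2, 44100, 16)
# Assumed studio case: 24 channels, 96kHz, 24-bit.
studio = stream_bytes_per_second(24, 96000, 24)

print(desktop)  # 176400
print(studio)   # 6912000

# A relaxed desktop buffer versus a tight low-latency buffer.
print(round(buffer_latency_ms(1024, 2, 44100), 2))  # 46.44
print(round(buffer_latency_ms(64, 2, 96000), 2))    # 1.33
```

Under these assumptions, the studio session moves roughly 39 times as much data as the desktop stream, and the low-latency setup leaves the system little more than a millisecond to deliver each buffer.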
JACK development is steady and continues to expand the system's capabilities. JackMIDI is showing up in more applications, and the jackdmp Project points the way toward JACK's future on multiprocessor architectures. Current versions already run on OS X, and there's even been a successful port to Windows. Supported back ends now include ALSA, OSS, PortAudio, FreeBob/FFADO (for FireWire devices) and CoreAudio (on OS X).
Erik de Castro Lopo has contributed some essential components to the Linux audio infrastructure. His libsndfile provides programmers with a comprehensive library for handling file I/O for a great variety of soundfile formats, and his libsamplerate has found broad acceptance as the preferred tool for high-quality sample rate conversion. These libraries relieve applications programmers from the burden of writing code for very common tasks, and both packages are common dependencies throughout the world of Linux audio software. I'm also happy to report that both libraries are currently maintained.
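To give a sense of the kind of chore these libraries take off the programmer's plate, here is a deliberately crude sample rate converter in Python. It is not libsamplerate's API or algorithm. It uses plain linear interpolation, which is about the simplest approach that works and which introduces artifacts that a high-quality band-limited converter is designed to avoid. The function name resample_linear is hypothetical.

```python
def resample_linear(samples, src_rate, dst_rate):
    """Convert a mono sample sequence from src_rate to dst_rate.

    Each output sample is a weighted average of the two nearest input
    samples. No anti-aliasing filter is applied, so this is a teaching
    sketch rather than something to use on real recordings.
    """
    if not samples:
        return []
    out_len = len(samples) * dst_rate // src_rate
    out = []
    for n in range(out_len):
        pos = n * src_rate / dst_rate      # position in the input stream
        i = int(pos)                       # nearest input sample at or before pos
        frac = pos - i                     # distance past that sample
        nxt = samples[min(i + 1, len(samples) - 1)]  # hold the last sample at the end
        out.append(samples[i] * (1.0 - frac) + nxt * frac)
    return out

ramp = [0.0, 1.0, 2.0, 3.0]
print(resample_linear(ramp, 2, 4))  # [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.0]
print(resample_linear(ramp, 4, 2))  # [0.0, 2.0]
```

Even this toy has edge cases to get right (empty input, the final sample, rounding of the output length), which is a fair argument for depending on a maintained library instead.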
The sound-related software most familiar to typical users includes media players, games and audio communication devices. In each instance, the application itself does not handle audio directly, instead relying on the kernel's sound API (that is, ALSA). This reliance frees application developers to focus on features, rather than on how to interface with users' sound hardware.
Linux music players are a mixed blessing. For average users, programs such as Amarok, Banshee, Rhythmbox and the XMMS clan work well for playing most audio formats (MP3, Ogg, WAV, AIFF and so on). AlsaPlayer continues to provide a lighter-weight player, not so feature-rich but stable and JACK-savvy. The lack of JACK-awareness is one of my personal complaints about most of the current music players, but I have reason to hope that the major players will get it and at least provide a plugin for JACK connectivity. And, while I'm dreaming, I'd also like to see Linux media players adopt the JACK master control system. More typical wish lists include true gapless playback and support for huge collections. The development teams of the popular players are quite aware of these requests and are working to address them in future releases. As Figure 1 indicates, some developers are indeed moving forward.
Multimedia players, such as MPlayer and Xine, continue their development march forward. These projects are well established, and many users rely on them heavily for more than just DVD and video file playback. MPlayer (and its sister software MEncoder) is a veritable toolkit for a wide variety of video and audio tasks, and the Xine library is used by many other applications that need video capabilities. Both programs play a wide variety of video and audio formats, and both include hooks for user-friendly GUIs.
Alas, there is a snake in the grass in this field. Video players depend on codecs that provide support for the seemingly endless variety of video formats, and many popular formats are patent-encumbered. One immediate result of this situation is the difficulty or impossibility of including these codecs in a mainstream Linux distribution. Some distros simply point users to a repository where they can download the necessary packages, but it would, of course, be better if the codecs could be installed along with the players. However, until patent law reform takes place (in the US, at least), there can be no other way to supply the software.
Playback of encrypted DVDs also is problematic. It appears that the MPAA is no longer pursuing legal action against distribution of the DeCSS software, but distribution vendors remain hesitant to include the software directly. Again, users typically are directed to a distribution point on the Internet where they can acquire the software they need to watch their legally purchased discs. Although these extra steps may seem trivial to seasoned users, they often are confusing and seem unnecessary to novices, especially when there is little or no understanding of the legal ramifications. Nevertheless, until patent encumbrance and copyright entanglements are things of the past, the extra steps will be necessary if users expect fully functional media playback.
Beyond PySol and XScrabble, I'm not much of a gamer. However, I do follow the updates on the Linux Game Tome and the Linux Games sites, and the scene for Linux gaming and game development clearly is alive and active. Game-centric programming toolkits flourish; new games appear frequently (with the attendant and predictable variability in quality), and even the occasional port from Windows shows up. The common critique I hear from avid gamers is that Linux is a great platform for running games, but too few great games exist in native Linux versions. Indeed, Windows users can claim a massive number of high-quality games available only for that platform, but from this dabbler's perspective, the Linux gaming world is healthy and developing nicely.
Most of the currently maintained game development toolkits (ClanLib, Crystal Space and SDL) support ALSA and OSS, but the Allegro library also supports JACK, which I think is very cool. The OpenAL Project still is under development, but slowly. Creative Labs and Apple have invested in the system's development, mainly for Vista and OS X, but it appears that 3-D and surround sound (5.1, 7.1) are fully supported in the Linux releases as well.
Linux-powered portable hardware is common these days, so we can expect to encounter the Linux sound system at work in those devices too. Alas, I own no such devices and cannot directly comment on implementation and performance of the sound system in that hardware. However, LinuxDevices.com publishes a handy on-line list of Linux-powered audio/video devices, most of which are media players, set-top boxes, integrated media phones and so forth. Two notable exceptions to the category include Ron Stewart's amazing Trinity, a portable Linux-powered DAW (Figure 2), and the Plugzilla, a rackmounted standalone audio plugins player. I don't own either of those units, but both should be tested and evaluated as soon as possible.
The Wine Project has reached its 1.0 release stage. Among its many virtues, we find support for a variety of audio/MIDI back ends, including ALSA, JACK and OSS. Some sound and music programs for Windows run flawlessly with Wine, including Cockos Software's excellent Reaper audio/MIDI sequencer, thanks to work on the wineasio driver. This driver communicates with Wine's JACK support to yield surprisingly low-latency performance when running ASIO-compliant Windows applications under Wine, including VST/VSTi plugins. However, even with wineasio, it still is unlikely that the major music and sound packages for Windows (Cubase, Logic, Finale and so on) will run flawlessly under Wine. Those programs tend to be large packages with a complicated relationship to the operating system, typically more complicated than can be emulated with Wine.
Ardi's Executor, a Mac OS emulator, is gone, but at least two good Atari emulators remain. If you want to run all that late 1980s MIDI music software written for the Motorola 68K CPUs, XSteem and Hatari will do the job. Alas, the Steem Project appears to be on hold, but Hatari is in current development.
The DOSemu Project continues on its steady development track. Recent releases include significant improvements to the emulator's sound and music capabilities, better integrating its functions with the kernel's ALSA system. The DOSBox Project supports sound through the SDL audio library, with special emphasis on game sound compatibility. MIDI output is supported, but current versions lack MIDI input capability.
Emulators may become relics if virtualization delivers equal or better performance. I have not yet tested music and sound applications in environments such as VMware or VirtualBox, but the specifications for those systems typically include ALSA and OSS support via virtualized hardware. Unfortunately, the virtual sound devices typically are compatible only with the SoundBlaster16 or Intel's ubiquitous AC97 audio codec. These devices are sufficient for low-demand programs, but they are not suitable for use with high-end music and sound software for Windows.
A few intrepid commercial sound and music software houses have offered Linux ports of their packages. The Renoise tracker (Figure 3) is available in an excellent version for native Linux. Jorgen Aase's energyXT2 DAW (digital audio workstation) has a sizable base of Linux users, and Garritan recently announced that Aria, its next-generation sampler engine, will be available in a native Linux version. Other vendors, such as NCH Software (WavePad) and Cockos (Reaper), advertise that their programs work with Wine and extend official support to that environment.
The number of these packages hardly constitutes a flood of releases from major Windows developers, but such small streams can grow. More users are becoming interested in Linux, and some percentage of those users will be focused on its audio capabilities and its applications for sound and music production. An opportunity exists for commercial developers to expand into the Linux world, and their way has been made clear by the solidity of the Linux audio infrastructure. I applaud the houses that already have crossed into the Linux world, but it remains to be seen whether these motivations and attractions are strong enough to compel other commercial houses to develop native Linux packages of their software.
The Linux audio infrastructure provides well-designed and well-tested programming interfaces for sound and music applications developers, particularly if they employ JACK to handle audio (and now MIDI) I/O. Alternatives include the OSS API and directly programming ALSA, but JACK is truly the superior solution.
Regarding GUI toolkits: Qt and GTK remain the dominant players, but FLTK and wxWidgets also are popular. This multiplicity of GUI toolkits has been a problem for plugin developers, although the emerging LV2 specification may resolve that issue.
Python and its GUI bindings have become popular for some types of music applications, Tcl/Tk remains a popular scripting language for smaller applications and rapid prototyping, and Java programmers have added a sizable number of excellent applications to the Linux audio software armory. Java audio programmers also can employ JJack, a JACK audio driver for the JavaSound API. At this time, only the Frinika sequencer makes use of JJack, but I hope to see it receive the attention it deserves.
The JUCE multiplatform development environment provides excellent tools for developing audio applications. The JUCE framework is fully JACK-compliant, but unfortunately, its adoption has been slow so far. Current implementations include Rick Taube's GraceCL (next-generation algorithmic music environment), Kjetil Matheussen's Mammut (massive FFT audio transformer) and Lucio Asnaghi's JOST plugin system. These programs all have attractive GUIs with excellent audio capabilities, all courtesy of the JUCE framework.
Audio/video-optimized Linux distributions are flourishing. Stand-out systems include Planet CCRMA, 64 Studio, JAD, MusiX, Dynebolic and Ubuntu Studio. Some of these distros include ISO images for making live CDs that can be used to test the system without installing it to your hard disk. All of them have been engineered for low-latency performance, and all are currently maintained. These distributions are the Linux audio novice's best friends; they are highly recommended for anyone who wants to work seriously with audio/MIDI on Linux.
Ivica Ico Bukvic is the current director of Linuxaudio.org, which is “...a not-for-profit consortium of libre software projects and artists, companies, institutions, organizations and hardware vendors using Linux kernel-based systems and allied libre software for audio-related work, with an emphasis on professional tools for the music, production, recording and broadcast industries.” Among its many purposes, the organization functions as a portal to a variety of “priority” links, including URLs for an applications index, a software mirror, a VST plugins compatibility database and other useful resources.
Linux audio developers meet annually at the Linux Audio Conference, held in Köln in 2008. Rumor says that LAC2009 may be held in Parma, Italy, but no definite plans have been made at the time of this writing. This conference is the event of the season for Linux sound folk: a four-day fiesta of presentations, performances and much sharing of ideas, code and music. Keep an eye on the LAC link page at Linuxaudio.org for news of next year's conference, and be there if you can.
Program-centric communities have evolved around the maintained projects. Communications channels include the typical forums, wikis, mailing lists and IRC channels, but they now also include sites such as YouTube and MySpace. YouTube has become an especially useful channel for demonstration and instructional videos. Some examples of Linux audio software in action can be found there now, and I expect more to appear.
A wide variety of music made with Linux software can be heard at Hans Fugal's LAM site. Other good sources for Linux-made music include the Linux Audio Users mailing-list archives, the Internet Archive and, of course, the forums and other comm channels mentioned above.
In my opinion, the Linux audio infrastructure is now a solid structure, with exceptional capabilities and provision for future development. JACK is by itself a most remarkable achievement, and it has become the cornerstone for all serious audio applications, particularly in the pro-audio domain.
Configuration has been all but completely automated during installation, and post-installation configuration has become a no-brainer in most distributions. Distribution developers deserve high praise for the work done in this regard. Again, it's not sexy stuff, but it makes a great difference to the newbies and even to the not-so-newbies.
Audio performance on the normal multitasking desktop has been a problematic point, but the PulseAudio Project promises a satisfactory resolution to that problem. Only time will tell if its adoption becomes widespread.
Normal applications that require audio support are well served by the current software map. Requested features are being implemented, and usability has improved greatly since Ye Olden Times. With software mixing and relatively xrun-free playback, the desktop audio system is looking and sounding better all the time.
In Part II of this article, I'll assess the current state of development of Linux sound and music applications. Until then, stay tuned.