This article describes the process of porting a variety of audio applications from the SGI platform to the Linux operating system.
The process used to port SGI audio applications was reflective of Linux's own distributed developments—a truly international collaboration dependent on Internet communication. It is still a work in progress, with improvements and extensions to the software being created and contributed by programmers around the world.
NoTAM is the Norwegian network for Technology, Acoustics and Music research located at the University of Oslo. Professor Oyvind Hammer of NoTAM wrote a series of applications designed to aid musicians and researchers in the analysis and representation of sound. Written originally for Silicon Graphics (SGI) computers, these applications utilize the graphics capabilities of the X Window System and make use of SGI's audio and sound file systems. Many applications offer basic and advanced editing features as well as sound file playback capability.
Although not traditionally thought of as an audio platform, Linux already has several sound file editing and processing systems. Packages such as MiXViews, Snd, XWave and the CERES Soundstudio are available for audio recording, editing and playback. Many other packages can be found on the Internet. The Linux Soundapps page (see Resources), which I maintain, is a comprehensive and up-to-date list of Linux-based audio applications.
The editing capabilities of the NoTAM software are of a different nature: edits are performed on the graphic results of a Fast Fourier Transform (FFT) of a sound file. Explaining FFTs is beyond the scope of this article, but the results of a transform are usually depicted in a “spectral representation”, i.e., a representation of frequency versus time. With the appropriate software, edits can then be made directly on the frequency content of a sound. Until recently, Linux had no such software, so porting the NoTAM applications to Linux was an attractive prospect.
When the NoTAM source code (made publicly available by Dr. Hammer and NoTAM) was examined, I noted that the graphics code consisted of plain X calls, and its sound support consisted of calls to the SGI-specific audio and audiofile libraries. Many of the applications were built with Motif. Since the necessary X libraries and LessTif (a freely available replacement for the Motif libraries) were already available, all that was needed in order to do the port were replacements for the audio and audiofile libraries. I contacted Dr. Hammer and asked him for permission to try porting the software to Linux, and with his gracious consent, the porting project began.
Looking over an excellent web page dedicated to SGI audio applications (maintained by Doug Cook), I noticed a sound-file editor named DAP (Digital Audio Processor), written by Richard Kent. DAP uses the XForms libraries, so I inquired about the possibility of a Linux port. Richard informed me that he had already written such a port, and when I mentioned I wanted to port some other software written for the SGI to Linux, he graciously supplied the code used in DAP to replace the SGI audio and audiofile libraries and headers. The Linux versions of libaudio.a, libaudiofile.a and their associated header files are virtually direct “plug-in” replacements, meaning I could leave the NoTAM sources relatively intact.
Armed with X11, Richard's replacement code and LessTif, I attempted my first port. I chose Dr. Hammer's Sono. This program analyzes a sound file and creates a PostScript sonograph of the spectral analysis. Since Sono does not display the graphics, instead relying on external display mechanisms, the port was fairly trivial, requiring only minor modifications.
With this first success, I moved on to another relatively straightforward program, PTPS, which creates a PostScript graph of a pitch-tracking analysis. PTPS also compiled easily with only small changes, so I decided to attempt a more substantial port.
Ceres is an FFT-based program, but its design goes far beyond the simplicity of Sono and PTPS. Ceres renders the FFT results into a graphic display which can then be edited directly and saved as a new sound file. The challenge in porting Ceres was primarily in the X programming. Since no real-time aspects were involved, there were no problems with audio playback. There were, however, problems with the use of variable-length Xt argument lists which, in theory, must be terminated with a NULL entry. The SGI compiler and libraries did not appear to require this NULL; however, the Linux GCC compiler and libraries were stricter, and Ceres would fail with a segmentation fault upon opening if the NULL was missing. In addition, a problem with different maximum random numbers (RAND_MAX) between SGI and Linux caused a crash when using the Random Sieve transform. Once these two problems were solved, the porting of Ceres was complete.
I then decided to do an even more ambitious port, Dr. Hammer's Mix package. Mix is a 9-channel audiofile mixer with graphic waveform displays, graphic volume and panning curves, a scripting language for complex mixes, and real-time effects processing. (See Figure 1.) Obviously, audio playback capabilities are exploited to the full. I thought porting Mix would be by far the most difficult challenge, yet with Richard's replacement libraries (and some judicious code-cutting), I was quickly able to compile the Mix application, and Linux now has an excellent 9-channel, sound file mixer.
I placed the ported programs and their sources on the Music Technology Department's FTP server at Bowling Green State University in Ohio, I then notified Dr. Hammer of our successes (he was pleased we had achieved so much), and I also advised NoTAM that their software was now available for Linux. NoTAM obtained the packages and placed them on their server, making the applications easily available to everyone. I also sent notices to the Csound mailing list and comp.os.linux.announce to inform the Linux community of the availability of these packages.
Since the releases were made public, development has continued. Working from our source packages and Richard's library code, the Swedish composer Reine Jonsson has contributed a version of Mix which now handles the popular WAV format sound files (the original Mix, along with the other NoTAM packages, supports only AIFF format files), while reducing loading times and enhancing playback smoothness (a critical factor on my 486/120). A new version of Ceres (see Figure 2) called Ceres2 is in development by Johnathan Lee and should be available in a Linux port by the time this article is published.
Improvements can still be made: the applications ported so far are reasonably stable but will sometimes crash for no apparent reason. In some cases, not all of the original functionality is available, particularly if the package uses routines specific to the SGI's audio hardware capabilities. The Mix source code, for instance, includes calls to the SGI MIDI interface, but a replacement library for those calls has yet to be written. For now, I have had to disable the MIDI control code in the Mix sources. I have received a substantial amount of mail from users who have expressed interest in seeing more of this porting development, and my hopes are high that we will soon have a replacement for the SGI MIDI libraries to add to the audio and audiofile libraries already supported.
It must be mentioned that the NoTAM packages are not the only sources for high-quality UNIX audio-processing software available for possible porting. Guenter Geiger has successfully ported Paul Lansky's RT, another real-time, sound file mixer with excellent scripting capabilities. Work proceeds on ports of Paul Lansky's Ein (a DSP scratch pad), Mara Helmuth's Patchmix (a graphic patcher for the Cmix audio-processing language), and Russell Pinkston's XPatchWork (similar to Patchmix, but using the Csound language). Many other audio-related packages are available for Linux, and the interested reader should look at the Linux Soundapps web page for a continuously updated and comprehensive listing.
When I first used Linux I was thrilled by its possibilities, but dismayed by the lack of high-quality sound-processing software. Nevertheless, I was inspired by the availability of source code and the willingness of the Linux community to help develop audio applications. Since I am not a programmer, I relied on the skill, experience and advice of Linux users around the world. Like Linux itself, these projects were developed by a distributed collaborative effort, heavily dependent on the Internet for all communication, and built with freely available tools and libraries.
Thanks to Dr. Oyvind Hammer, Richard Kent, the LessTif developers and the XFree86 project, Linux audio software grows in quantity and quality almost daily. I encourage interested developers to contact me and let me know what they're working on or what they wish to work on. Many projects are waiting for developers who would like to contribute their talent and interest to the rapidly growing Linux audio and music software base.
by Richard Kent
When I first heard of Dave's project to port audio applications from an SGI-based environment to Linux, I was very interested in becoming involved—particularly because the sheer dearth of audio applications for UNIX was the primary reason for programming the Digital Audio Processor. I initially implemented DAP for SGI-based systems, but shortly before Dave contacted me, I successfully ported DAP to run in a Linux-based environment. (See Figure 3.) This experience helped greatly when porting the NoTAM applications. This sidebar is intended to detail the three main technical considerations when attempting such ports.
Most, if not all, SGI audio applications make extensive use of the excellent audio and audiofile libraries supplied with IRIX. The audiofile library provides an API primarily designed to allow effortless loading and saving of AIFF audio files. The audio library is designed to allow straightforward audio input and output as well as control global audio settings. In order to make porting as painless as possible, replacement libraries had to be written for the Linux operating system.
The audiofile library was tackled first. Since this library simply has to perform file I/O based on the calls made, it was relatively straightforward to set up the necessary AIFF structures and to initialize, load and save these structures as necessary. However, because samples are read from and written to disk one sample frame at a time, this straightforward port of the audiofile library is relatively slow. In addition, only AIFF files are supported—compressed AIFF-C and WAV format files are not.
The audio library was a more demanding port, requiring extensive investigation into the facilities provided by the Open Sound System (OSS/Free) driver which is, by default, compiled into the Linux kernel. The basic process when using OSS involves opening /dev/dsp and either writing sample data directly to the device or reading from the device. In addition, opening /dev/mixer allows control over the global audio settings.
The Linux conversion basically sets up internal sample queues and provides facilities to transfer these sample queues to and from /dev/dsp. Most complications which arose were due to the many different sample formats (resolution and number of channels) supported by both the audio library and the OSS driver, thus requiring many different data conversions on input and output.
The resulting audio library on Linux is very much a “brute force” conversion and differs significantly from the SGI-based library, despite the almost identical API. The main difference is that the Linux audio library is not threaded whereas the SGI-based library constantly inputs and outputs sample frames from a cyclic queue in the background. As a result, the API user needs to be aware that samples must be constantly written to or read from the device to avoid audio glitches. Also, when finishing sample playback, blank samples must be written to the device to force the remaining sample queue to play before closing the device. The other main difference is that only one audio “port” may be open at any given time, due to the exclusive nature of opening /dev/dsp.
The default SGI compiler is quite different from gcc, which is the most commonly used compiler on Linux. More specifically, the SGI compiler is “relaxed” and not nearly as strict as gcc. This manifests itself in several ways, which must be considered when porting software from one platform to the other.
The most obvious difference is that explicit casting is often required on gcc to avoid warnings and errors which do not occur when using the SGI compiler. Two examples are shown here.
Default SGI compiler accepts:
int x = 3.2; int *y = calloc (10,sizeof (int));
Linux gcc requires:
int x = (int) 3.2; int *y = (int *) calloc (10,sizeof (int));Correct link order is also more important when using the Linux gcc linker. The default SGI linker appears to place little importance on the order of link components (object files and libraries) when linking, as long as all “loose ends” are tied up by the end of the linking process. The Linux gcc linker, which I freely admit to not fully understanding, is much stricter and frequently requires reordering of link components and sometimes even duplication of linked libraries.
Another major difference between the SGI default compiler and gcc arises when combining C and C++ files where the C files cannot, for syntactic reasons or otherwise, be passed through the C++ compiler. When using the default SGI compiler, the command for compiling both C and C++ files is CC, so there is no need to explicitly specify linkage specification using the extern C construct. When using the gcc development environment, the command to compile C files is gcc and the command to compile C++ files is g++; thus, the linkage specification must be specified when referring to C-based functions within C++ files, or else linking will fail due to C++ name mangling.
One final major difference between SGI and Linux development environments is that of variable argument lists for Xt and Motif function calls. The Xt toolkit provides the developer with basic GUI constructs which may be used directly to create a user interface. In addition, Motif and LessTif use the Xt toolkit to provide a higher-level API for constructing user interfaces.
Within these toolkits are functions which have a variable number of arguments, much like the standard printf system call. Unlike printf, these functions require a null entry to terminate the argument list. However, in the SGI development environment, these null entries are optional and SGI developers frequently forget to terminate the argument list with such an entry. This does not cause a problem on SGI-based systems, but if this same code is then compiled in a Linux environment, the resulting executable will almost certainly core dump upon execution. The fix is simply to add the missing null entries to the relevant calls.