Linux Teleconferencing: Improving the Wireless Network

Izzet Agoren

Issue #85, May 2001

Izzet gives insight as to how industry juggernauts tackle the challenge of wireless teleconferencing.

A stated goal of the Third Generation (3G) cellular standards bodies is for the wireless network to function as a seamless extension of the Internet and other IP-based packet network services. Much of the ground-breaking work still to come has been passed from companies to standards bodies and on to the Internet Engineering Task Force (IETF). IP transparency extends to mobile devices the plethora of services already available to those with a wire line to the Internet, and it makes it simple to add new and more creative services explicitly designed to enhance the wireless experience. An obvious result is increased demand for wireless minutes from customers of the 3G carrier members.

The bandwidth available for broadband data transmission is both restricted and costly. Licenses for the frequency spectrum needed to support 3G services have been auctioned to network operators in the UK for $36 billion. More recently, the Federal Communications Commission (FCC) raised a net $17 billion in a similar auction to US-based operators. This prompts carriers to improve the efficiency with which they employ the spectrum. The use of IP as a transport mechanism for voice (e.g., VoIP) requires the wireless network to carry real-time multimedia, even as it struggles to carry non-real-time multimedia. For example, a full-rate voice encoder like the G.723.1 used by GSM-type networks generates 30ms (milliseconds) of payload per packet, saddled with 40 bytes of combined IP, UDP and Real-time Transport Protocol (RTP) headers, as prescribed in the H.323 International Telecommunication Union (ITU) protocol for the delivery of multimedia. The 30ms of speech payload typically translates to 20 bytes, so only 20 of every 60 bytes transmitted carry speech, an efficiency of only 33%.
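The arithmetic behind that figure can be checked with a short Python sketch. The split of the 40 header bytes into 20 (IPv4 without options), 8 (UDP) and 12 (RTP without CSRC entries or extensions) is the standard minimum for each protocol; the function names are illustrative only.

```python
# Minimum header sizes in bytes for an uncompressed IPv4/UDP/RTP packet
# (no IP options, no RTP CSRC entries or header extensions).
IP_HEADER = 20
UDP_HEADER = 8
RTP_HEADER = 12


def efficiency(payload_bytes, header_bytes):
    """Fraction of the transmitted bytes that carry speech."""
    return payload_bytes / (payload_bytes + header_bytes)


def bit_rate(total_bytes, frame_ms):
    """Rate in kbit/s when total_bytes are sent once every frame_ms."""
    return total_bytes * 8 / frame_ms


header = IP_HEADER + UDP_HEADER + RTP_HEADER  # 40 bytes, as quoted above
payload = 20                                  # one 30ms speech frame
frame_ms = 30

print(f"header bytes: {header}")
print(f"efficiency:   {efficiency(payload, header):.1%}")
print(f"speech rate:  {bit_rate(payload, frame_ms):.2f} kbit/s")
print(f"on-air rate:  {bit_rate(payload + header, frame_ms):.2f} kbit/s")
```

Put another way, the headers triple the bit rate the radio link has to carry for the same speech.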

Figure 1. Spectral Efficiency of a H.323 Packet

Header compression algorithms proposed to the IETF reduce the header to as little as one byte in steady state, but they come with varying degrees of complexity that can hobble an implementation. A header compression algorithm is required to present both a bit-exact payload and a bit-exact encapsulating IP protocol header to the mobile device. This is referred to as lossless compression.

Header compression is therefore an imperative measure for increasing the spectral efficiency of multimedia communications, particularly since CRTP, proposed in the late 1990s by Cisco, is deemed insufficiently robust by the steering committees of both the IETF and the Third Generation Partnership Project (3GPP).

The details of which robust header compression (RoHC) algorithm the IETF will choose have not been fully determined. Nonetheless, the requirements have been hammered out and are summarized in Table 1. In fact, the current proposals meet many of the requirements but are peppered with intellectual property rights, something that defeats the objective of the IETF. A second round of proposals is under consideration so that different methodologies may be solicited. The current algorithms exploit the nature of IP/UDP/RTP streams. Tables 3, 4 and 5 classify the header fields of IP, UDP and RTP, respectively. Table 2 defines the classifications.

Table 1. RoHC Requirements

Table 2. Header Field Classifications

Table 3. IP Header Fields and Classifications

Table 4. UDP Header Fields and Classifications

Table 5. RTP Header Fields and Classifications

The Essence of Compression/Decompression

The essence of these algorithms lies in the strategies born of these classifications. They are “never send”, “communicate at least once”, “communicate at least once or update”, “communicate update and/or refresh frequently”, “guarantee continuous robustness”, “communicate as is in all packets and establish” and “be prepared to update delta”.
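To make two of these strategies concrete, here is a toy Python sketch of "communicate at least once" and "be prepared to update delta". It is not any of the IETF proposals. The field grouping is an assumption made for illustration: addresses, ports and the RTP SSRC are treated as constant for a stream, while the RTP sequence number and timestamp are treated as changing by a predictable increment. All class and field names are hypothetical, and the sketch has no protection against packet loss, which is precisely what the real proposals must add.

```python
# Assumed grouping, for illustration only.
STATIC_FIELDS = ("src_ip", "dst_ip", "src_port", "dst_port", "ssrc")
DELTA_FIELDS = ("seq", "timestamp")


class ToyCompressor:
    def __init__(self):
        self.context = None  # last header seen
        self.deltas = None   # last increments sent to the peer

    def compress(self, header):
        # No context yet, or a static field changed: send the full header.
        if self.context is None or any(
                header[f] != self.context[f] for f in STATIC_FIELDS):
            self.context = dict(header)
            self.deltas = None
            return ("FULL", dict(header))
        deltas = {f: header[f] - self.context[f] for f in DELTA_FIELDS}
        self.context = dict(header)
        if deltas != self.deltas:
            # The increments changed: communicate the update once.
            self.deltas = deltas
            return ("DELTA", deltas)
        # Steady state: the peer can predict every field on its own.
        return ("SAME", None)


class ToyDecompressor:
    def __init__(self):
        self.context = None
        self.deltas = None

    def decompress(self, packet):
        kind, body = packet
        if kind == "FULL":
            self.context = dict(body)
            self.deltas = None
        else:
            if kind == "DELTA":
                self.deltas = dict(body)
            for f in DELTA_FIELDS:
                self.context[f] += self.deltas[f]
        return dict(self.context)


def make_header(n):
    # 240 timestamp ticks per packet: 30ms of audio at an 8kHz RTP clock.
    return {"src_ip": "10.0.0.1", "dst_ip": "10.0.0.2",
            "src_port": 5004, "dst_port": 5004, "ssrc": 0x1234,
            "seq": 100 + n, "timestamp": 240 * n}


comp, decomp = ToyCompressor(), ToyDecompressor()
kinds = []
for n in range(5):
    original = make_header(n)
    packet = comp.compress(original)
    kinds.append(packet[0])
    assert decomp.decompress(packet) == original  # lossless round trip
print(kinds)  # ['FULL', 'DELTA', 'SAME', 'SAME', 'SAME']
```

After the first two packets the stream settles into the steady state, where nothing but a packet-type marker needs to cross the link. A single lost packet would leave this toy decompressor permanently out of step, which is why robustness dominates the requirements.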

Telecommunications companies that have pursued this approach to compressing packet headers include Nokia, Matsushita, Cisco and, most notably for its heavyweight effort, Ericsson. The full details of their proposals have been submitted in IETF draft form and can be found at http://www.dmn.tzi.org/ietf/rohc/.

The Solution

The effects of the wireless channel and the response of the compression algorithms can be modeled using a small collection of hardware and software, namely a Linux box, a pair of H.323 generators and some freely available libraries. The rather simple setup required to simulate all of this can be built with just three computers, a pair of speakers and microphones, a few extra Ethernet cards and two crossover cables. The software is just as minimal, and when it is used with the setup shown in Figure 2, we obtain an easily reproducible system.

Figure 2. The Machine in the Middle Model

The OpenH323 project is an excellent source of the software components required to generate packets that comply with the standard. Two alternatives are available at the web site (http://www.openh323.org/) for both Linux and Windows platforms. For those of you who already have Windows, NetMeeting, commonly bundled with the operating system, offers yet another H.323-compliant multimedia engine.

Any implementation that simulates the channel needs to account for several important scenarios, including:

Startup: establishing the session requires a great deal of robustness so that the compressor/decompressor pair can establish context, as described in the previously mentioned strategies.
Handoff: handoffs are a function of how fast the mobile device is moving, and the number of packets dropped follows a Poisson distribution with a mean of nine. This behavior is considered graceful in that third-generation networks allow the session to survive the handover without having to restart.
Deep fade: these are the enemies of wireless communications and are caused by the hostile nature of spitting bits out on radio waves. Deep fades are typically attributed to momentary shadowing of the radio signal as well as the detrimental effect of interference experienced in congested areas. They are currently the major limiting factor, and the subject of much academic scrutiny, in the operation of all third-generation cellular networks.
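The handoff scenario can be sketched in Python as follows. The Poisson sampler uses Knuth's multiplication method, since the standard random module has no Poisson generator. Treating the dropped packets as one consecutive burst is an assumption made here for illustration; the text above specifies only how many packets are lost. The function names are hypothetical.

```python
import math
import random


def poisson_knuth(mean, rng):
    """Draw one Poisson-distributed integer (Knuth's multiplication method)."""
    limit = math.exp(-mean)
    count = 0
    product = rng.random()
    # Multiply uniform variates until the product falls below e^(-mean).
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def apply_handoff(packets, start, rng, mean_drop=9):
    """Remove a Poisson-sized burst of packets beginning at index start.

    Assumption: the losses are consecutive.
    """
    burst = poisson_knuth(mean_drop, rng)
    return packets[:start] + packets[start + burst:], burst


rng = random.Random(2001)  # fixed seed so a run can be repeated

samples = [poisson_knuth(9, rng) for _ in range(20000)]
print("sample mean:", sum(samples) / len(samples))

stream = list(range(100))  # stand-ins for 100 voice packets
survivors, burst = apply_handoff(stream, 40, rng)
print("dropped", burst, "packets;", len(survivors), "survived")
```

The sample mean should land close to nine, and each call to apply_handoff models one handover event in the stream.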

Having installed the Linux machine in the middle of the network, we delve into the TCP/IP stack so that we can create the adaptation functions in a modular, nonblocking, input/output, real-time, client/server fashion. In effect, we stop the packet flow by setting up rules in ipchains and use the libpcap interface to bring the stopped packets up to user space for analysis and, ultimately, compression/decompression. Having physically reproduced the architecture of Figure 2, it is a trivial matter to establish an uninterrupted teleconference session whose packets are forced through your Linux machine. IP forwarding must be enabled for this to function, particularly since we are going to stop the flow of packets with ipchains. To verify that IP forwarding is indeed enabled, type:

cat /proc/sys/net/ipv4/ip_forward

The output is 1 if we are ready to forward packets. The next step is to verify that we can stop and restart the stream without killing the session. This means the session's TCP packets must be uninterrupted. At this point, we need only concentrate on IP/UDP/RTP packets. The following commands will stop and restart the stream:

ipchains -P forward DENY
ipchains -P forward ACCEPT

The -P option sets the default policy of the chain and does not discriminate by protocol; it will stop all forwarded IP traffic, including ICMP, TCP and UDP packets.

Now that we can play with the stream, we can be selective about which packets we transmit and how we transmit them.

Figure 3. The Linux TCP/IP Stack

The link layer (LL) is where we want to pick up our packets in order to retain all the IP fields. The packets are received by the networking stack and queued in a linked list of sk_buff structures, where they are serviced automatically by the bottom-half software interrupts of the kernel in ip_input.c, ip_forward.c and ip_output.c. For a more in-depth treatment of how socket buffers are managed in memory, see Alan Cox's “Network Buffers and Memory Management” (Linux Journal, October 1996). Most user-space programs interface with the networking stack via Berkeley Packet Filter (BPF) or INET sockets. For security purposes, these socket interfaces were not designed to delve down to the Ethernet or device/physical layer (PL). A compromise is reached by opening a raw socket that retains the IP fields by interacting directly with the IP layer of the stack. Although reading packets in their raw form is supported by libpcap, transmitting them is only feasible through modifications to libpcap itself. The definitive text on the TCP/IP networking stack is UNIX Network Programming, Vol. 1, by W. Richard Stevens. For a more Linux-specific treatment of the subject, the interested reader is referred to David A. Rusling's “Chapter 10, Networks” in The Linux Kernel, which can be found at www.linuxhq.com/guides/TLK/net/net.html.

A succinct way of avoiding changes to either the kernel or libpcap, while still passing raw packets in full to and from the Ethernet device, is found in the form of a Perl5 CPAN package, namely RawIP (http://quake.skif.net/RawIP/). Figure 3 maps the Linux TCP/IP stack to the code in the kernel (2.2.x) responsible for handling the packet flow to be compressed.

Listing 1 [at LJ's ftp site] is a rudimentary means, using Perl5, of picking up the stopped packets, outputting the contents of the IP id field and, in turn, passing the packets on to their final destination. The IP addresses of the source and destination are the only parameters required. The script discriminates by protocol, so we are now able to concentrate on just the voice or even, as it turns out, video packets.
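Listing 1 itself is not reproduced here. As a rough illustration of the one step it is described as performing on each packet, reading the IP id field, the following Python sketch pulls the 16-bit Identification field (bytes 4 and 5 of the IPv4 header) and the protocol byte (offset 9, where 17 means UDP) out of a raw packet. It is not the author's Perl script; the function names and the sample header values are made up for the example.

```python
import struct

PROTO_UDP = 17  # IANA protocol number for UDP


def ip_id(packet):
    """Return the 16-bit Identification field of a raw IPv4 packet."""
    if len(packet) < 20:
        raise ValueError("too short for an IPv4 header")
    if packet[0] >> 4 != 4:
        raise ValueError("not an IPv4 packet")
    (ident,) = struct.unpack_from("!H", packet, 4)  # network byte order
    return ident


def ip_protocol(packet):
    """Return the protocol number carried in the IPv4 header."""
    return packet[9]


# A hand-built 20-byte IPv4 header with made-up values:
# version/IHL, TOS, total length, id, flags/fragment, TTL, protocol,
# checksum (left at zero here), source address, destination address.
sample = struct.pack("!BBHHHBBH4s4s",
                     0x45, 0, 60, 0xBEEF, 0, 64, PROTO_UDP, 0,
                     bytes([10, 0, 0, 1]), bytes([10, 0, 0, 2]))

if ip_protocol(sample) == PROTO_UDP:
    print("IP id:", ip_id(sample))
```

Filtering on the protocol byte first is one simple way to ignore everything except the UDP traffic that carries the RTP stream.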

The script creates a text file, id_dump.txt, that gives a handle on the IP id during any given session. Extending this to the other fields by quoting them in the script is the first stage in creating a state machine that implements any one of the proposals submitted to the IETF working group. Listing 2 uses the Math::Random Perl5 CPAN package to introduce an average packet (or, in the case of VoIP, frame) error rate of 20%, with errors drawn from a uniform distribution. The effects are immediately distinguishable when Listing 2 is used in conjunction with the gateway of Listing 1, which now begins to take the form of a high-level wireless channel simulator. The justification for deeming this a sufficient means of corrupting the stream is twofold. First, although a very precise 3G wideband code-division multiple access modulating channel model with multiple Rayleigh fading paths is freely available for download from w3.antd.nist.gov/wctg/3G/3G.html and comes highly recommended, its use can have a tremendous impact on the real-time nature of the session in the absence of a Beowulf cluster. Second, the resulting delay undermines the subjective assessment of system performance that engineers often rely upon when judging the perceived quality of voice transmission.

Listing 2. Perl5 Frame Error Rate Thanks to Knuth
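Listing 2 is not reproduced in this text. As a rough stand-in for the idea it is described as implementing, the following Python sketch drops each frame independently with a probability of 20% by comparing a uniform random number against the target rate. It uses Python's standard random module rather than Math::Random, and the function name is hypothetical.

```python
import random


def lossy_channel(frames, error_rate, rng):
    """Yield the frames that survive the channel.

    Each frame is dropped independently with probability error_rate,
    decided by a uniform random number in [0, 1).
    """
    for frame in frames:
        if rng.random() >= error_rate:
            yield frame


rng = random.Random(85)   # fixed seed so a run can be repeated
sent = list(range(10000))  # stand-ins for 10,000 voice frames
received = list(lossy_channel(sent, 0.20, rng))

observed = 1 - len(received) / len(sent)
print(f"observed frame error rate: {observed:.3f}")
```

Over a long stream the observed rate settles near the 20% target, while any short stretch of speech may lose noticeably more or fewer frames, which is what makes the effect audible.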

Now that you are armed with a broadly applicable framework, the next step would be to apply the work to the analysis of video packets and their resilience to our hostile channel, or to pack it into an embedded system to create a low-cost teleconferencing tool.