LJ Archive CD

WIX: a Distributed Internet Exchange

Richard Hulse

Issue #135, July 2005

A city-wide Ethernet opens up a world of possibilities, from letting companies share data across sites cheaply to saving ISPs from crushing demand when a local site hosts a Webcast all the users want.

Wellington, the capital city of New Zealand, has one of the oldest and possibly largest distributed Internet exchanges in the world. It is built on top of the Citylink public LAN infrastructure. In this article, we look at how it all started and the part Linux plays in its day-to-day operation.

History

Back in the 1980s, Richard Naylor, then IT manager for the Wellington City Council, was stuck with a common but difficult problem—an over-utilized VAX cluster at one data centre and an under-utilized one at another centre across town. He came up with an idea that was unique for the time: run a fibre optic cable between the two council buildings and share processing resources that way. The idea was so new that a jointer had to be flown down from Auckland to do the splices on the now-ancient slotted core cable.

Naylor set up a 10Mbps Ethernet connection using DECBridges, and the network itself ran DECnet over 10BASE5 with a little 10BASE2. Terminals made up most of the load at the time, as PCs were not yet networked. The overall idea worked, and the system was upgraded to 10BASE-T in 1989, with IP being added.

In the early 1990s, Naylor's idea caught the attention of then Mayor Fran Wilde, who was intrigued by what Naylor and colleague Charles Bagnall had been up to in what was called “their spare time”. Mayor Wilde had attended a local secondary school production and noticed that it was being streamed live on the Internet by an off-duty Naylor.

After talking with Naylor and Bagnall, Wilde came to understand what was possible. Their design and the use of fibre could be used to provide a broadband infrastructure for the city, and that plan became a key part of a 25-year strategy for Wellington City.

Soon after, 17 investors, including the council, came up with $5,000 each, and three drums of fibre optic cable were run from one end of Wellington City to the other. The cable was run along overhead trolley bus electric support cables during November and December 1996. The primary aim was to provide an infrastructure to enable greater growth within the local business community.

In the first few years, Citylink expanded at a rate of 100% each year—doubling the number of connected buildings. At one point, the team connected 50 new buildings in ten weeks.

In 1997, Naylor left the council and helped set up Citylink as a separate company. The first customers were government departments and financial businesses, as central government is located in Wellington. Later customers included publishers and IT companies. ISPs were there from the start too, with many using the fibre infrastructure as a way of providing genuine broadband to city customers at a low cost. I should note here that Citylink does not consider domestic 1Mb and 2Mb connections to be broadband; it prefers to start at 10Mb. Citylink can provide 10Mb, 100Mb or 1,000Mb connections anywhere on the fibre infrastructure.

Nearly ten years after the initial cable was positioned, around 50 kilometres (30 miles) of cable now exists within the central business district of Wellington. More than 300 buildings are connected.

The Infrastructure

The Citylink fibre network originally was interconnected with a bunch of oddball switches and hubs from various vendors. Over time, these have been phased out and replaced with Cisco switches, mainly 35xx and 29xx, on a gigabit core.

Citylink now offers two major services, dark fibre and Ethernet. Dark fibre services allow point-to-point connectivity between buildings. Each customer has sole use of his or her own strand of fibre and can connect whatever gear is required on either end. Dark fibre runs up to 1Gb/s at present. The Ethernet services are the most widely used and allow customers to connect to a city-wide “shared Ethernet”.

INL, a newspaper publisher, was the first Citylink customer. The company used dark fibre to link its corporate office to its Wellington office, and it used Ethernet to link back to the Wellington City Council offices. From there, a microwave link to Victoria University of Wellington provided access to the public Internet. At that point, Citylink was involved in providing only basic layer 2, Ethernet infrastructure.

Routing 101

The Internet exchange runs on top of the fibre Ethernet infrastructure. Before we look at this in detail, let's briefly run through a few routing basics. A router on a network receives packets of data, each with its own destination address. The router checks its internal list of addresses—the routing table—to see if there is a route or path to that destination. If there is, the packet is sent to the specified address. This might be the final destination, or it could be a gateway—another router that repeats the process and sends the packet on once again.
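The lookup described above can be sketched in a few lines of Python. This is only an illustration, not how any production router is built: the prefixes, the next-hop names and the lookup function are made up for the example. When more than one entry matches, the sketch picks the most specific one (the longest prefix), which is the usual rule.

```python
import ipaddress

# Hypothetical routing table: (destination network, next hop).
ROUTES = [
    (ipaddress.ip_network("0.0.0.0/0"), "upstream-isp"),   # default route
    (ipaddress.ip_network("192.0.2.0/24"), "peer-a"),
    (ipaddress.ip_network("192.0.2.128/25"), "peer-b"),
]


def lookup(destination, routes=ROUTES):
    """Return the next hop for destination, or None if no route matches."""
    addr = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in routes if addr in net]
    if not matches:
        return None
    # The most specific route (longest prefix) wins.
    _net, hop = max(matches, key=lambda m: m[0].prefixlen)
    return hop


print(lookup("192.0.2.200"))   # matches all three entries; the /25 is chosen
print(lookup("198.51.100.1"))  # only the default route matches
```

The next hop returned here stands for either the final destination or a gateway that repeats the same lookup.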

A traditional Internet exchange has participants connecting to a single location, and each participant has a router. One side of the router is connected to all of the other routers by way of a common Ethernet backbone. The other side is connected to the participant's own network infrastructure (Figure 1).

Figure 1. One set of route servers—IPv4 (Soekris 4801) on the left and IPv6 (Soekris 4501) on the right.

Each participant has a list of IP addresses that represent the networks it can access. Each router has its own IP address on the common backbone. These addresses often are private to the outside world and act as gateways to each participant's network.

As a point of interest, the first Citylink routers in general use were based on standard PC components with LS-120 floppy or disk-on-chip boot mechanisms running the Linux Router Project (LRP). These were deployed only for customers using wireless access points.

Enter BGP

Border Gateway Protocol (BGP) is fundamental to the operation of the Internet, because it automates the sharing of routes by a process called advertising. Adjacent routers establish a BGP session and advertise their networks to the other routers. It is a bit like one router saying to another, “Here is my IP address. If you have traffic with any of the following destination addresses, send it to me.” BGP doesn't stop there, though. Routers also share information they have gathered from other routers. They say, “I also know how to get traffic to some other networks. Send that to me as well.”
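To make the idea of advertising and re-advertising concrete, here is a toy simulation in Python. It is a sketch under heavy assumptions: the AS numbers and prefixes are invented, the propagate function is hypothetical, and the only selection rule is "shortest AS path". Real BGP carries many more attributes and applies local policy before path length.

```python
from collections import deque


def propagate(origins, links):
    """Flood advertisements between BGP-like speakers.

    origins: {asn: [prefix, ...]}   networks each AS originates
    links:   [(asn_a, asn_b), ...]  established sessions (both directions)
    Returns {asn: {prefix: as_path}}; a locally originated prefix has path ().
    """
    neighbours = {asn: set() for asn in origins}
    for a, b in links:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    tables = {asn: {} for asn in neighbours}
    queue = deque()
    for asn, prefixes in origins.items():
        for prefix in prefixes:
            tables[asn][prefix] = ()
            queue.append((asn, prefix))

    while queue:
        sender, prefix = queue.popleft()
        # The sender prepends its own AS number before advertising.
        advertised = (sender,) + tables[sender][prefix]
        for peer in sorted(neighbours[sender]):
            if peer in advertised:
                continue  # loop prevention: ignore paths that contain yourself
            known = tables[peer].get(prefix)
            if known is None or len(advertised) < len(known):
                tables[peer][prefix] = advertised
                queue.append((peer, prefix))  # pass the news on
    return tables


tables = propagate(
    origins={64501: ["192.0.2.0/24"], 64502: [], 64503: ["198.51.100.0/24"]},
    links=[(64501, 64502), (64502, 64503)],
)
print(tables[64503]["192.0.2.0/24"])  # path learned through the middle AS
```

AS 64503 has no session with AS 64501, yet it still learns a path to 192.0.2.0/24 because AS 64502 passes on what it has heard.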

Typical Internet exchanges are different from the rest of the Internet and use an Interior Gateway Protocol (IGP) instead. This protocol does not pass on information to other routers; it advertises only routes within the networks to which it is connected. The router says, “You can send me traffic for these addresses, but you can't pass that address information on to any other routers.”

The Citylink Exchange

In 1998, Citylink started work on deploying a BGP/4-based Internet exchange on top of the public Ethernet. One of the key design criteria was a low cost of entry. So the Citylink team took a step back and looked at the exchange problem from a logical point of view: what were the core requirements of an exchange, and how could they use the existing fibre infrastructure to create an environment that would encourage as many people as possible to join? Exchanging traffic via an exchange is called peering.

A key component of Citylink's approach involved placing routers at a customer's point of connection to the fibre infrastructure rather than in a central location. Not a lot of options were available for BGP/4-capable routers, though—you pretty much bought a Cisco or a Cisco (Figure 2).

Figure 2. The Citylink fibre (yellow cables) is converted to copper (blue) for distribution to clients.

Figure 3. A typical fibre switching point, as found in many PABX rooms around Wellington.

Because the exchange was distributed, Citylink needed a low-cost mechanism to allow people to peer, and so they started using Zebra on Linux as an alternative routing platform to Cisco.

Because of the limited space in clients' cabinets and the need to be able to buy the same hardware over a reasonable period of time, Citylink now has standardized on two systems. Boards from Soekris Engineering (4501 and 4801) are used for routers up to 30Mb/s, while Nexcom P4 GigE and VIA 100Mb boards are used for 100Mb/s connections and upwards to 300–500Mb/s on a good day. The Linux Embedded Appliance Firewall (LEAF) forms the basis of all software used on Citylink routing hardware.

Customers can use the Citylink routers or whatever suits their needs and budgets. A wide variety are in use, from D-Link and ZyXEL at the low end to Juniper and Cisco boxes at the high end. A lot of people also use their own Linux and *BSD boxes. Using PC-based routers does limit the options for network interfaces, but this is not an issue for Citylink, which uses only Ethernet.

In addition to devolving hardware to clients' premises, Citylink manages the routing tables for the whole exchange by way of route servers, the operation of which I explain later in this article.

The new exchange was dubbed WIX, the Wellington Internet Exchange. There is a fixed monthly charge for connection, and traffic over the fibre network is free. Because the network often runs right past a potential user's door, it is easy for anyone to connect and peer. And this is exactly what has happened—even end users who could never peer under a traditional model now can. For all exchange users, access to the global Internet still necessitates the purchase of “transit”—access to the global Internet routing table—from at least one ISP.

The open peering approach has made a huge impact on traffic flows and latency. When the Internet first started in New Zealand, the country's single exchange point, at Waikato University, was also the international gateway for the whole country. Two businesses in Wellington wanting to exchange data would send it to their respective ISPs, who sent it upstream to Waikato. The path time for this typically ran 50–200ms. When ISPs began to exchange data directly in Wellington, the path time dropped to 20ms. End-user peering reduced this by another factor of ten, to 2ms.

In addition to shorter and faster network paths, no charges are levied for traffic sent over Citylink by way of the exchange. ISPs never see the traffic, because the exchange directs it through the shortest path to the requesting party.

A number of printing companies peer at WIX and exchange huge publishing files at no cost. One local newspaper runs an FTP server on WIX to which pre-press companies can upload files. Some graphics houses also run media stores for their clients, and these can be browsed at no cost, as though they were part of the client's own LAN. These are only a few of the ways that the fibre infrastructure and WIX have helped local businesses to lower costs, expand and innovate.

Figure 4. Cables coming off overhead trolley bus supports to a junction point on a building veranda. The ability to exchange huge files at no cost helps local businesses save money and offer new services.

The distributed exchange environment has been likened to a market square—anyone can trade his or her wares with no cost for participation. Not everyone chooses to peer openly though, and the exchange supports all types of participation. A number of ISPs exchange traffic only with established customers and ignore any other traffic. Some simply use WIX as a neutral point to exchange data with one other party.

About 130 entities are peering at WIX, with close to 1,000 using the Citylink fibre for private purposes or public Internet connectivity. Encouraging peering comes with a downside though—a lot of participants means a lot of routing tables need to be managed.

The Many Uses of Linux

In order to make route changes simpler for the hundreds of WIX users, Citylink has deployed two route servers. Rather than having to peer with every router on WIX, the preference is to peer with only the route servers, dramatically simplifying route table maintenance for everyone. Each client router, then, has to maintain only two BGP sessions—one to each of the two route servers rather than to hundreds of routers.
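The saving is easy to put numbers on. The short Python calculation below compares a full mesh, where every peer holds a session with every other peer, against peering only with the route servers. The figure of 130 peers is the approximate count of WIX participants; the function names are just for illustration.

```python
def full_mesh_sessions(peers):
    """Total BGP sessions if every peer pairs with every other peer."""
    return peers * (peers - 1) // 2


def route_server_sessions(peers, servers=2):
    """Total BGP sessions if each peer talks only to the route servers."""
    return peers * servers


peers = 130  # approximate number of entities peering at WIX

print(full_mesh_sessions(peers))     # 8385 sessions, 129 on every router
print(route_server_sessions(peers))  # 260 sessions, 2 on every router
```

Adding one more participant to a full mesh means a configuration change on every existing router. With route servers, only the newcomer and the two servers need to change.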

The route servers (Figure 1) don't carry any traffic at all. They simply reflect routes from one peer to all of the rest. I was surprised to find that the servers are based on the same small-footprint, diskless 266MHz Pentium board made by Soekris Engineering that is used for the routers. LEAF is installed on CompactFlash for reliability and fast booting. The Quagga Routing Software Suite is used to provide BGP services, and the kernel handles packet routing.

Two routers are used for IPv4 to provide redundancy, and a second pair provides services for IPv6. Citylink maintains its own route registry and uses the Routing Policy Specification Language (RPSL) to manage the IPv4 routers. A set of shell and Perl scripts has been developed that uses RtConfig to construct configuration files for the Quagga software. All this gives Citylink tight control over what peers can announce to the servers and ensures that a replacement can be deployed quickly if required. The whole process is managed with Revision Control System (RCS) to allow backups to be made and to ensure consistency.
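The general idea of generating router configuration from a registry can be sketched as follows. This is not Citylink's tooling and not the output of the RPSL toolchain: the registry format, the render_bgpd_config function, and every name, address and AS number are hypothetical. The emitted lines follow the general shape of Quagga bgpd statements (prefix lists, neighbor definitions, route-server clients), but the exact syntax should be checked against the Quagga documentation before use.

```python
def render_bgpd_config(local_asn, registry):
    """Build bgpd-style configuration text from a list of peer records."""
    prefix_lists = []
    neighbours = [f"router bgp {local_asn}"]
    # Sort by name so the same registry always yields the same file,
    # which keeps revision-control diffs small.
    for peer in sorted(registry, key=lambda p: p["name"]):
        plist = f"{peer['name']}-in"
        for i, prefix in enumerate(peer["prefixes"], start=1):
            prefix_lists.append(
                f"ip prefix-list {plist} seq {i * 5} permit {prefix}"
            )
        addr = peer["address"]
        neighbours += [
            f" neighbor {addr} remote-as {peer['asn']}",
            f" neighbor {addr} route-server-client",
            # Accept only the prefixes registered for this peer.
            f" neighbor {addr} prefix-list {plist} in",
        ]
    return "\n".join(prefix_lists + ["!"] + neighbours) + "\n"


# Hypothetical registry entries; all values are made up for the example.
REGISTRY = [
    {"name": "beta", "address": "203.0.113.20", "asn": 64502,
     "prefixes": ["198.51.100.0/24"]},
    {"name": "alpha", "address": "203.0.113.10", "asn": 64501,
     "prefixes": ["192.0.2.0/25", "192.0.2.128/25"]},
]

print(render_bgpd_config(64500, REGISTRY))
```

Because the file is rebuilt from the registry each time, a replacement server can be given an identical configuration without hand editing.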

Participants on WIX may advertise only those address spaces that are within their own individual network boundaries. The route servers re-advertise what they learn and filter out addresses that should never be routed, which are called bogons. Examples of these are the loopback range, 127.0.0.0/8; addresses allocated for private networks, such as 172.16.0.0/12; and non-assigned addresses.
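A bogon filter of this kind can be illustrated with Python's standard ipaddress module. The list below is deliberately partial (loopback and the three private ranges) and is an assumption for the example, not the list the route servers use; a real filter also covers unassigned and reserved space. The function names are hypothetical.

```python
import ipaddress

# A partial bogon list for illustration only.
BOGONS = [ipaddress.ip_network(n) for n in (
    "127.0.0.0/8",     # loopback
    "10.0.0.0/8",      # private network range
    "172.16.0.0/12",   # private network range
    "192.168.0.0/16",  # private network range
)]


def is_bogon(prefix):
    """True if prefix lies inside, or contains, any listed bogon range."""
    net = ipaddress.ip_network(prefix)
    return any(net.overlaps(bogon) for bogon in BOGONS)


def filter_announcements(prefixes):
    """Keep only the prefixes that are safe to re-advertise."""
    return [p for p in prefixes if not is_bogon(p)]


# 198.51.100.0/24 is a documentation prefix standing in for a real allocation.
announced = ["198.51.100.0/24", "172.16.5.0/24", "127.0.0.0/8"]
print(filter_announcements(announced))
```

Using overlaps rather than an exact match means that an over-broad announcement that swallows a bogon range is rejected as well.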

At present, the IPv6 routers are maintained manually. When RPSL for IPv6 is standardised, however, and the amount of work required increases sufficiently, scripts will be used.

Anycast but Not Anywhere

One interesting technique deployed on WIX and its sister exchange, APE, in Auckland, is anycast routing. A good example of anycast routing is the recent addition of a mirror of a root nameserver at WIX. Because of the way BGP and anycast work, a query to the root server goes to the nearest mirror automatically. If an ISP peers at WIX, it can get a 2ms ping time. International paths are over 200ms, so this is a huge improvement.
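With anycast, the same prefix is announced from more than one place, and each router simply prefers the announcement that looks closest. The Python sketch below shows that choice using AS-path length as the only measure of "nearest". That is a simplification, since real BGP considers local policy and other attributes first, and all of the names and AS numbers are invented.

```python
def best_route(routes):
    """Pick the route with the shortest AS path; ties go to the first listed."""
    return min(routes, key=lambda r: len(r["as_path"]))


# One anycast prefix as seen by a hypothetical router peering at the exchange.
routes_seen = [
    {"via": "transit-isp", "as_path": (64510, 64520, 64530), "site": "overseas"},
    {"via": "exchange", "as_path": (64530,), "site": "local mirror"},
]

chosen = best_route(routes_seen)
print(chosen["site"])  # the mirror one AS hop away is preferred
```

A router that does not peer at the exchange never hears the local announcement, so its queries keep going to the more distant copy.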

Local media companies also use anycasting to provide content on the exchange at a low marginal cost to ISPs. Rather than having to bring content requested by their customers across expensive international circuits, the ISPs can get it locally.

Anycast also can be used to limit the distribution of the traffic to only local networks. One example of this is The Return of the King premiere parade, which was Webcast using anycast routing from downtown Wellington. Over five hours, about 12 terabytes of data was requested by New Zealand customers, and this content was provided at no cost to ISPs. A mirror of the stream also was provided from a server in the USA for international viewers. Another example is the provision of software mirrors such as one for Debian, the distribution used by Citylink.

One of Citylink's biggest innovations is the provision of wireless Internet connections in cafés and some business premises in Wellington. The first access point was installed in June 2002, and the service was launched officially in November of that year. Currently, more than 200 access points are in operation (Figure 5).

Figure 5. A CafeNet access point. The antenna is the small white rod on the right side of the box.

One good example of how this Wi-Fi technology can benefit the community is the recent installation of a wireless access point at the Mary Potter Hospice for terminally ill patients. Two laptops on mobile trolleys are used to allow patients to stay in touch with their families or simply to read material on-line in their own time.

Summary

Citylink also operates the Auckland Peering Exchange (APE), which has about 40 peering participants. A recent addition was the Palmerston North Internet Exchange (PNIX), which, although it has only one participant right now, serves as a place for content providers to mirror servers. Other exchanges are planned for other parts of New Zealand in the near future.

Citylink has found that Linux is readily adaptable to whatever the company needs it to do. Having “intelligent” routers running Linux has meant that deeper firewalls can be deployed; it also has given staff access to better debugging tools. For example, tcpdump can be used to examine traffic through a router in real time if required.

The use of Linux and other open-source software has been a key enabler in creating an affordable public Ethernet and a low cost-of-entry distributed Internet exchange. It will continue to allow the number of exchanges to grow and to fuel innovation and collaboration in other centres around New Zealand.

Resources for this article: /article/8265.

Richard Hulse is a broadcaster based in Wellington, New Zealand. He currently is working on a number of IT projects that involve using Linux to bridge between disparate systems.
