ILM says they have rarely seen artists get excited by hardware, but artists fought to get the new Linux workstations: Dell single-CPU P4s with NVIDIA Quadro2 Pro graphics cards. The question became, “Where's my Linux box?”
Production Engineering Manager Ken Beyer says:
More than 350 Linux boxes were deployed during Episode II. Animators and modelers got their workstations first, then compositors. The first group had flat panels because animators lack the desk space for monitors. There were problems with monitor calibration under Red Hat 7.1. We used flat panels to get Linux out there. Last to get workstations were TDs. They push the envelope of what they ask for. An issue was how quickly we could get things ported for them.
“We've changed over quite a bit of our plant here to Linux—half of our desktops and about 30% of our 2,000 CPU renderfarm is now Linux”, says ILM Director of Research and Development Andy Hendrickson. “We've got 700-plus O2 machines”, adds Beyer. “But it isn't affordable to replace those with Octanes.” SGI is recognized for producing high-end workstations and servers but has abandoned competing with commodity PC hardware. SGI seems to be rebounding in the military market but less so in entertainment.
“Our renderfarm towers carry the Deathstar logo”, points out Beyer. A render tower is a stack of 1RU, 2-CPU units connected with inexpensive 100Base-TX. He says:
These are 1RU, 2-CPU P4 units. If we lose a unit it is more convenient now that it is just two CPUs rather than four or eight with SGI 2800. For Episode II we had to double available capacity and power. It's 512 processors. We use dual 225 kVA UPS systems, and have three AC systems that rotate. Power goes out often in the San Rafael area. We can run on UPS for 15 minutes then [on a] diesel generator.
An unexpected snag arose during the upgrade: all the PC fans had to be replaced because they were defective.
Systems R&D Group Manager Mike Kiernan reports a few problems with Linux:
Sometimes when I arrive in the morning a quarter of the Linux cluster is locked up. Fortunately, it doesn't happen too often. VM problems in the 2.4 kernel appear to be at the root of our kernel lockup problem. Recent improvements in the 2.4 kernel may resolve that. Things look promising.
But he adds that “Linux needs work on NFS big time.”
We won't be going to Linux for our NFS servers. I wish we could replace NFS, but none of the document management systems is flexible enough. And the ones that are flexible have a rather high integration cost. When AFS is distributed natively for all the client platforms we need to support, perhaps we'll consider it.
ILM is comfortable with multiple platforms. Its 1,400 employees use a variety of operating systems. The art department has Macs, with the rotoscopers and painters transitioning to OS X. Hendrickson sees OS X as a possible player. “What attracts us is the BSD-like Darwin core and network compatibility.” ILM has few Windows boxes besides those on the business side. “There's no advantage to a Windows conversion for us”, says Hendrickson. “We're a UNIX shop and probably always will be.”
R&D Principal Engineer Phil Peterson says ILM chose the Red Hat distro because it seemed easier to go with what's popular. “At ILM the 2.4.9 kernel is deployed, and 2.4.17 or 2.4.18 is in test. We tweak the kernel—things like shared memory size, number of file descriptors, default stack size—nothing dramatic.” Open Motif 2.1 did a good job maintaining the look-and-feel of IRIX, so ILM didn't try LessTif. ILM workstations include limited installations of GNOME and KDE. “No special effort was spent to strip machines down”, says Peterson. “We just left out unused portions of the full install. We're pretty vanilla.”
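To make the kinds of settings Peterson mentions concrete, here is a minimal sketch that reads the current shared memory ceiling, file descriptor limits and stack size limits on a Linux machine. It is not ILM's tooling, and the article does not say how ILM applies its tweaks. The helper names `read_proc_int` and `current_limits` are hypothetical, the script only reads values rather than changing them, and it assumes the usual `/proc/sys` entries are present (it reports `None` where they are not).

```python
import resource
from pathlib import Path


def read_proc_int(path):
    """Return the integer stored in a /proc file, or None if it is unavailable."""
    try:
        return int(Path(path).read_text().split()[0])
    except (OSError, ValueError, IndexError):
        return None


def current_limits():
    """Collect shared memory, file descriptor and stack size settings."""
    nofile_soft, nofile_hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    stack_soft, stack_hard = resource.getrlimit(resource.RLIMIT_STACK)
    return {
        # System-wide maximum size of one shared memory segment, in bytes
        "shmmax_bytes": read_proc_int("/proc/sys/kernel/shmmax"),
        # System-wide maximum number of open file handles
        "file_max": read_proc_int("/proc/sys/fs/file-max"),
        # Per-process open file descriptor limits
        "nofile_soft": nofile_soft,
        "nofile_hard": nofile_hard,
        # Per-process stack size limits, in bytes
        "stack_soft_bytes": stack_soft,
        "stack_hard_bytes": stack_hard,
    }


if __name__ == "__main__":
    for name, value in current_limits().items():
        # resource.RLIM_INFINITY marks a limit that is not capped
        shown = "unlimited" if value == resource.RLIM_INFINITY else value
        print(f"{name}: {shown}")
```

Running the script on a workstation and on a render node would show whether the two were configured alike, which is the sort of check a "pretty vanilla" setup keeps simple.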
An unusual aspect of the ILM Linux workstation configuration is the replacement of the MESA libs with SGI's open-source OpenGL implementation. “MESA is behind compared to the SGI version in aspects such as libGLU”, explains Peterson. Other studios have reported stability problems running Maya on Linux with NVIDIA drivers. Those problems may be due to MESA and not to Maya, NVIDIA or Linux, as previously thought. ILM has replaced the MESA libraries with a combination of NVIDIA's core OpenGL and libraries from the SGI open-source sample implementation.
“Chances are you will not find solutions in any documentation”, notes Peterson.
We don't have a support line to call. We fix things and extend. It introduces a layer of maintenance we're not used to. We had to use open-source drivers with tablets. With calibrating monitors, the work is ongoing. Still, we've had an easy road. Our artists are technically savvy, able to endure pain. Having the best testers in the world around the corner from you provides quick feedback.
Hendrickson concurs that Linux support can be a problem. He says, “As we get into Linux we're not finding one company to hand-hold. IBM and HP aren't there, yet. But, before Linux it was out of our control and out of control. [Now] we own our Linux problems.”
Is it possible for Linux to be too fast? “Due to the speed of Linux, for the first time in my life, 15 years in the business, I'm starting to feel some RSI [repetitive strain injury]”, says Technical Director Robert Weaver. “Usually you are working the machine, but Linux is so fast it can overwork you.” Weaver has to remember to take breaks because with Linux he doesn't get any breaks waiting for the machine anymore.