Control Linux with a wave of your hand! Hook up a Kinect to your PC and help invent the user interface of the future.
Minority Report has been in rotation on cable lately, and you've probably seen its futuristic vision of Tom Cruise standing in front of a large screen, manipulating information with waves of his hands. That vision is a bit closer to reality, thanks in part to the economies of scale of the game industry. I don't often have reason to sing the praises of Microsoft, particularly not in a magazine devoted to Linux and all things open. But one thing our friends in Redmond do very well is to commoditize hardware. They've done just that with the Kinect by creating it as a natural interface for the Xbox 360 game console. What's more, they've allowed open-source developers to create drivers for the device, and they've even allowed the third party that developed the technology, PrimeSense, Inc., to release its own device drivers for Linux, Windows and OS X.
Computer interfaces that use “natural” interaction have been termed, appropriately, natural user interfaces, or NUI. The implication is not so much that the interface itself is natural (let's face it, there is no natural, unlearned way to interact with a machine), but that the interface is very easy to learn, and in that sense natural. Natural user interfaces are a very active research topic, and a wide variety of prototypes have been demonstrated. The list of projects exploring natural interfaces and Kinect is ever-expanding. Here are some that have appeared on the Net:
Robotic vision systems.
Interior room mapping.
Light control with gestures.
Sign-language computer interfaces.
Standardized gesture interfaces.
Music with gestures (including the floor piano from Big).
Instruction (such as dance, karate and tai chi).
The Kinect is tightly packed with an array of sensors and specialized devices to preprocess the information received. Communication between the Kinect and the game console or Linux is through a single USB cable.
Several different estimates have been made of the cost to manufacture Kinect, ranging from about $56 to $150. Whatever the real cost, we have to hand it to Microsoft for reducing the cost of its original prototype (reportedly $30,000) by at least two orders of magnitude to create a viable commercial product.
You can view a complete teardown of the Kinect on YouTube (see Resources), and inside you will find the following:
A projector that projects a field of infrared beams, used to detect depth.
Two cameras (each 640x480): a monochrome infrared camera that is used to detect the array of reflected infrared beams (this is the information used to construct the image of depth) and a color camera that can be used to capture snapshots or as a Webcam.
A motor to tilt the sensor array up and down for best view of the scene.
An accelerometer to sense position.
Camera interfaces and preprocessors for the two cameras.
512MB of DDR memory and 8MB of Flash.
The infrared beams are encoded, and as they are reflected off surfaces in front of the sensor array and detected by the infrared camera, a preprocessor in the Kinect calculates the distance from the array to the reflecting surfaces. Proprietary software can use that information to identify likely humans in the scene and the likely positions of their arms and legs.
The short history of the Kinect and open source is interesting and instructive, and it demonstrates the awesome power of the open-source community:
On November 4, 2010, Microsoft released the Kinect in the United States.
That same day, Adafruit Industries announced it would pay anyone $1,000 for an open-source driver for the device on Windows or any other operating system.
Hours later, Microsoft announced that it would not condone hacking of any of its devices. Adafruit responded by raising the bounty to $2,000.
By Saturday, November 6, a hacker going by the name AlexP claimed to have hacked the interface to the Kinect's motor. Microsoft said it wasn't true. Adafruit raised the reward to $3,000.
By that Monday, AlexP had posted video proving his ability not only to control the motor, but also to interface to the depth perception and color camera on the Kinect. He could have chosen to release the code and collect the prize, but instead, that Tuesday, he posted a message saying he would release the code if $10,000 were contributed to fund his further work.
Tuesday evening, Adafruit released a large dataset recorded by a USB analyzer watching the data stream between the Kinect and Xbox. Hackers worldwide started using the dataset to work out the details of the interface.
On Wednesday, the Kinect was released in Europe. Hector Martin, near Bilbao, Spain, purchased one that morning, and using the data Adafruit had published, was able to get it connected to his PC by noon. The results were published on the libfreenect site, and the prize was won.
But the story doesn't stop there. By Friday of the same week, Microsoft reconsidered its position on the open-source drivers. In a remarkable bit of semantic derring-do, Microsoft said the Kinect had not been hacked by its definition, and actually praised the developers expanding the use of the device.
In just a few days, the open-source community had seized the opportunity provided by Microsoft and produced an open-source alternative that set off an explosion of possibilities for research into natural interfaces.
The Adafruit contest had been hosted on a GitHub site called libfreenect. Once the contest had been won, Josh Blake and some others founded a community called OpenKinect. From their Web site (see Resources):
OpenKinect is an open community of people interested in making use of the amazing Xbox Kinect hardware with our PCs and other devices. We are working on free, open-source libraries that will enable the Kinect to be used with Windows, Linux and Mac.
The OpenKinect community consists of over 2,000 members contributing their time and code to the Project. Our members have joined this Project with the mission of creating the best possible suite of applications for the Kinect. OpenKinect is a true “open source” community!
Our primary focus is currently the libfreenect software. Code contributed to OpenKinect where possible is made available under an Apache20[sic] or optional GPL2 license.
Around the same time, a number of companies with interests in commercializing natural interfaces formed a group called OpenNI. According to their Web site (see Resources):
The OpenNI organization is an industry-led, not-for-profit organization formed to certify and promote the compatibility and interoperability of Natural Interaction devices, applications and middleware. One of the OpenNI organization's goals is to accelerate the introduction of Natural Interaction applications into the marketplace.
OpenNI offers its software under several different licenses—LGPL for the open-source bits and just binaries for some proprietary parts (like skeletal identification). The founders of OpenNI include:
PrimeSense, Inc., the company that supplied Microsoft with the technology for “Project Natal”, which became the Kinect.
Willow Garage, a company focused on hardware and software for personal robotics.
SideKick, a game software company that develops motion-based games.
ASUS, the computer OEM, which is selling a different 3-D depth sensor based on PrimeSense technology called the Xtion PRO.
In general, OpenKinect appears to be approaching NUI from the bottom up, starting with the libfreenect driver and building on top of that, all strictly open source. OpenNI has more of an architectural vision for how NUI devices from different vendors can interoperate. There's plenty of room for the two organizations to work together.
The drivers and demo software for Kinect are readily available, both from OpenNI and from openkinect.org (see Resources). Installation on Ubuntu 10.10 is particularly easy, as both organizations provide prebuilt packages. If you're running a different Linux distro, RPMs and debs also are available, along with ample build instructions, so you shouldn't have a problem.
With the OpenKinect packages, you get some demo programs that show the depth perception and color image capability of the Kinect. Utilities are included to record the Kinect data stream and to emulate a Kinect so the software can be used without having the hardware connected. There also are language interfaces in various stages of development for the following:
Synchronous C interface (functions instead of callbacks).
The OpenNI package has similar capabilities, and it includes detailed documentation of the C/C++ interface to the underlying layers. The OpenNI documentation and sample code is oriented toward Visual Studio, but most of it also is applicable to gcc.
There are some not-so-obvious realities to using Kinect with Linux. They might or might not affect your explorations:
The USB connector on the Kinect device is nonstandard. That's okay if you buy the Kinect as a standalone device. It comes with a power supply and an adapter to use standard USB. If you buy the Kinect as part of an Xbox bundle, you will need to buy the power supply/adapter separately.
The depth perception technology in Kinect works best at distances of 6–8 feet. It's not a mouse-and-keyboard distance interface; it's a stand-across-the-room-and-wave interface.
The Kinect software is able to determine body position and infer the positions of arms and legs, but it can't resolve individual fingers. That makes tasks like sign-language interpretation difficult.
So, now for $150 you can have open-source access to hardware that would have cost you $30,000 a few years ago. What clever ideas can you come up with for NUI?