Forget the TV remote and the games controller: now you can control anything from your mobile phone to the television with just a wave of your hand.
Researchers at Newcastle University and Microsoft Research Cambridge (MSR) have developed a sensor the size of a wristwatch which tracks the 3-D movement of the hand and allows the user to remotely control any device.
Mapping finger movement and orientation, it gives the user remote control anytime, anywhere – even allowing you to answer your phone while it's still in your pocket and you're walking down the street.
Being presented this week at the 25th Association for Computing Machinery Symposium on User Interface Software and Technology, 'Digits' allows for the first time 3-D interactions without being tied to any external hardware.
It has been developed by David Kim, an MSR-funded PhD student at Newcastle University's Culture Lab; Otmar Hilliges, Shahram Izadi, Alex Butler, and Jiawen Chen of MSR Cambridge; Iason Oikonomidis of Greece's Foundation for Research & Technology; and Professor Patrick Olivier of Newcastle University's Culture Lab.
"The Digits sensor doesn't rely on any external infrastructure so it is completely mobile," explains David Kim, a PhD student at Newcastle University.
"This means users are not bound to a fixed space. They can interact while moving from room to room or even running down the street. What Digits does is finally take 3-D interaction outside the living room."
To enable ubiquitous 3-D spatial interaction anywhere, Digits had to be lightweight, consume little power, and have the potential to be as small and comfortable as a watch. At the same time, Digits had to deliver superior gesture sensing and "understand" the human hand, from wrist orientation to the angle of each finger joint, so that interaction would not be limited to 3-D points in space. Digits had to understand what the hand is trying to express—even while inside a pocket.
David explains: "We needed a system that enabled natural 3-D interactions with bare hands, but with as much flexibility and accuracy as data gloves."
The current prototype, which is being showcased at the prestigious ACM UIST 2012 conference today, includes an infrared camera, an IR laser line generator, an IR diffuse illuminator, and an inertial measurement unit (IMU).
David says: "We wanted users to be able to interact spontaneously with their electronic devices using simple gestures without even having to reach for them. Can you imagine how much easier it would be if you could answer your mobile phone while it's still in your pocket or buried at the bottom of your bag?"
It's All About the Human Hand
One of the project's main contributions is a real-time signal-processing pipeline that robustly samples key parts of the hand, such as the tips and lower regions of each finger. Other important research achievements are two kinematic models that enable full reconstruction of hand poses from just five key points. The project posed many challenges, but the team agrees that the hardest was extrapolating natural-looking hand motions from a sparse sampling of the key points sensed by the camera.
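To make the idea of recovering a pose from sparse key points concrete, here is a minimal sketch in Python. It is not the Digits kinematic model, whose details are not given here. It is a toy planar finger in which three joint angles are tied to a single flexion parameter, so that one sensed quantity (the distance from knuckle to fingertip) is enough to recover the whole finger pose. The segment lengths, coupling ratios, maximum flexion, and function names are all assumptions chosen for illustration.

```python
import math

# Hypothetical phalanx lengths in cm (proximal, middle, distal).
SEGMENTS = (4.5, 2.5, 2.0)
# Assumed coupling: each joint bends by a fixed fraction of one flexion parameter.
COUPLING = (1.0, 1.0, 2.0 / 3.0)
# Assumed maximum bend of the first joint.
MAX_FLEX = math.radians(90)


def fingertip(t):
    """Forward kinematics: fingertip (x, y) for flexion parameter t in [0, 1]."""
    x = y = 0.0
    angle = 0.0
    for length, ratio in zip(SEGMENTS, COUPLING):
        angle += ratio * t * MAX_FLEX  # joint angles accumulate along the chain
        x += length * math.cos(angle)
        y -= length * math.sin(angle)  # the finger curls toward negative y
    return x, y


def solve_flexion(target_dist, tol=1e-6):
    """Recover t from a sensed knuckle-to-fingertip distance by bisection.

    With the couplings above, the distance shrinks steadily as t grows
    from 0 to 1, so bisection on t is sufficient.
    """
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if math.hypot(*fingertip(mid)) > target_dist:
            lo = mid  # finger is still too straight: bend more
        else:
            hi = mid
    return (lo + hi) / 2


# Usage: simulate a sensed distance, then reconstruct the flexion from it.
sensed = math.hypot(*fingertip(0.4))
recovered = solve_flexion(sensed)
```

The point of the sketch is the design choice rather than the numbers: by constraining how the joints move together, a single measurement per finger becomes enough to drive a full, plausible-looking pose.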
"We had to understand our own body parts first before we could formulate their workings mathematically," Izadi explains. "We spent hours just staring at our fingers. We read dozens of scientific papers about the biomechanical properties of the human hand. We tried to correlate these five points with the highly complex motion of the hand. In fact, we completely rewrote each kinematic model about three or four times until we got it just right."
The team agrees that the most exciting moment of the project came when team members saw the models succeed.
"At the beginning, the virtual hand often broke and collapsed. It was always very painful to watch," Kim recalls. "Then, one day, we radically simplified the mathematical model, and suddenly, it behaved like a human hand. It felt absolutely surreal and immersive, like in the movie Avatar. That moment gave us a big boost!"
Digits isn't meant to be a general-purpose interaction platform, but to prove the utility of the technology, both the Digits technical paper being presented at UIST 2012 and accompanying video present interactive scenarios using Digits in a variety of applications, with particular emphasis on mobile scenarios, where it can interact with mobile phones and tablets. The researchers also experimented with eyes-free interfaces, which enable users to leave mobile devices in a pocket or purse and interact with them using hand gestures.
Another exciting application area for Digits is in gaming. Currently, gaming systems on the market do not support hand sensing at a high level of fidelity. Because of the technical challenges in sensing a full 3-D hand pose, most systems constrain the problem by limiting hand tracking to 2-D input only or by supporting interaction through surfaces and other tangible mediators. Digits could be complementary to these existing sensing modalities; one option could be to combine Kinect's full-body tracker with Digits' high-fidelity freehand interaction.
"By understanding how one part of the body works and knowing what sensors to use to capture a snapshot," Izadi says, "Digits offers a compelling look at the possibilities of opening up the full expressiveness and dexterity of one of our body parts for mobile human-computer interaction."
Because only the wrist is instrumented, the user's entire hand is left free to interact without data gloves, which are input devices worn as gloves and most often used in virtual reality applications to facilitate tactile sensing and fine-motion control. The Digits prototype, whose electronics are self-contained on the user's wrist, optically images the entirety of the user's hand, enabling freehand interactions in a mobile setting.