The World's Funkiest Body Pose Tracker

Our hands are the primary way that we interact with the world, whether we are driving a car, checking our messages on a smartphone, or turning on a light switch. So to build smarter, more interactive experiences with both our electronic devices and uninstrumented, everyday objects, we need to be able to determine exactly what someone is doing with their hands. Practical applications are myriad, and run the gamut from enabling realistic virtual reality experiences to sign language recognition and gesture detection. Solutions to the problem of hand tracking do exist, and some of them perform quite well, but traditional solutions tend to require lengthy setup processes, controlled environments, and expensive hardware.

The best-performing systems typically rely on trackers attached to the hand, which are localized in three-dimensional space by a series of anchor devices installed around the perimeter of an area. Other systems use fixed cameras to track the hand, so long as it stays in view. While these methods generally perform well, they are impractical for on-the-go use and tend to be expensive to implement. For this reason, instrumenting the body itself has been explored, but these solutions have two significant drawbacks: the sensors tend to protrude from the body in a way that would be unacceptable to nearly all real-world users, and they frequently rely on cameras. Cameras may be fine for some applications, but few people want cameras pointed at themselves throughout their entire day.

A duo of engineers hailing from Carnegie Mellon University have developed a new approach to the problem of portable, continuous hand tracking that may do away with the impracticalities of other methods. Suffering from a serious bout of the boogie woogie fever, the team named their invention DiscoBand; fortunately, the fever did not hinder their engineering skills. Their wristband-mounted device has a total of 16 low-resolution (8 x 8 pixel) depth cameras. Half of the sensors are directed at the hand, while the other half have a view of the arm, upper body, and environment. This yields a total of 1,024 3D point measurements, which is sufficient to create a good picture of hand pose even when some of the sensors are occluded. For this reason, the wristband can sit close against the wrist. Moreover, the data collected by the 64-pixel cameras is so sparse that it can render nothing more than a very rough blob shape, which protects the privacy of the wearer.
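The point count follows directly from the sensor layout: 16 sensors, each returning an 8 x 8 depth frame. The short Python sketch below works through that arithmetic and shows how the count shrinks when some sensors are fully blocked. The constant and function names are illustrative assumptions, not part of the DiscoBand software.

```python
# Illustrative arithmetic only; all names here are hypothetical,
# not taken from the DiscoBand software.
NUM_SENSORS = 16   # 8 facing the hand, 8 facing outward (per the article)
FRAME_WIDTH = 8    # each depth camera is 8 x 8 pixels
FRAME_HEIGHT = 8


def points_per_sensor() -> int:
    """Depth measurements in a single 8 x 8 frame."""
    return FRAME_WIDTH * FRAME_HEIGHT


def total_points(occluded_sensors: int = 0) -> int:
    """Depth measurements available when some sensors are fully blocked."""
    if not 0 <= occluded_sensors <= NUM_SENSORS:
        raise ValueError("occluded_sensors must be between 0 and NUM_SENSORS")
    return (NUM_SENSORS - occluded_sensors) * points_per_sensor()


print(points_per_sensor())  # 64
print(total_points())       # 1024
print(total_points(3))      # 832
```

Even with three sensors blocked in this toy calculation, more than 800 measurements remain, which gives a feel for why partial occlusion need not be fatal.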

The primary use cases the team wanted to support with DiscoBand were arm and hand tracking. Accordingly, they built a prototype of the wristband and worked out the details for processing the data it generates. To enable arm tracking, they used data from the eight outward-facing depth cameras. The data was processed to extract the most relevant features, which were then fed into a machine learning regression model that predicts the location in three-dimensional space of the left wrist, left elbow, left shoulder, right shoulder, left hip, and right hip.
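As a rough illustration of what such a regression output looks like, six body key points with three coordinates each amount to 18 predicted values per frame. The Python sketch below unpacks a flat vector of that size into named 3D positions. The flat x, y, z ordering, the function name, and the sample numbers are assumptions made for illustration; the article does not describe the model's actual output format.

```python
from typing import Dict, List, Sequence, Tuple

# The six key points named in the article; the snake_case labels are ours.
KEYPOINTS: List[str] = [
    "left_wrist", "left_elbow", "left_shoulder",
    "right_shoulder", "left_hip", "right_hip",
]

Point3D = Tuple[float, float, float]


def unpack_prediction(flat: Sequence[float]) -> Dict[str, Point3D]:
    """Split a flat [x0, y0, z0, x1, y1, z1, ...] vector into named points.

    The flat layout is an assumption for this sketch.
    """
    expected = 3 * len(KEYPOINTS)  # 6 key points x 3 coordinates = 18
    if len(flat) != expected:
        raise ValueError(f"expected {expected} values, got {len(flat)}")
    return {
        name: (flat[3 * i], flat[3 * i + 1], flat[3 * i + 2])
        for i, name in enumerate(KEYPOINTS)
    }


# Placeholder numbers standing in for one frame of model output.
pose = unpack_prediction([float(v) for v in range(18)])
print(pose["left_wrist"])  # (0.0, 1.0, 2.0)
print(pose["right_hip"])   # (15.0, 16.0, 17.0)
```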

To validate the device, the team conducted a study with ten participants, who were asked to perform a variety of predefined arm poses. To capture ground truth measurements, a webcam was set up and MediaPipe Pose software was used to extract arm key points. These were compared with the key points determined by DiscoBand, and a mean error of 5.88 centimeters was observed across all upper-body points.
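One common way to compute an error figure like this is the mean Euclidean distance between each predicted key point and its ground-truth counterpart. The Python sketch below assumes that metric; the article does not spell out the exact formula used in the study, and the function name and sample coordinates are made up for illustration.

```python
import math
from typing import Sequence, Tuple

Point3D = Tuple[float, float, float]


def mean_keypoint_error(predicted: Sequence[Point3D],
                        ground_truth: Sequence[Point3D]) -> float:
    """Average straight-line distance between matching key points.

    The result is in the same unit as the inputs (centimeters here).
    """
    if not predicted or len(predicted) != len(ground_truth):
        raise ValueError("inputs must be non-empty and of equal length")
    distances = [math.dist(p, g) for p, g in zip(predicted, ground_truth)]
    return sum(distances) / len(distances)


# Made-up coordinates in centimeters, chosen so the distances are 5 and 12.
predicted = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
truth = [(3.0, 4.0, 0.0), (10.0, 0.0, 12.0)]
print(mean_keypoint_error(predicted, truth))  # 8.5
```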

In the future, the researchers hope to see their technology incorporated into existing smartwatches. From there, they envision it enabling many additional use cases, including ad hoc touch tracking and recognition of held objects.
