Making computers smarter—with TikTok dance videos

[Image: Four differently colored versions of a man in a baseball cap and sweater, ranging from photo to gray virtual representation.]

TikTok dance videos captured legions of fun-hungry fans during the Covid-19 lockdown. But U of M researcher Yasamin Jafarian found a deeper purpose for the viral phenoms.

For the last year, Jafarian, a doctoral student in computer science and engineering, has tapped the videos for the frame-by-frame building blocks she uses to construct lifelike 3D avatars of real people. She finds many of today's 3D avatars cartoonish and wants to replace them, using machine learning and artificial intelligence (AI) to generate more realistic avatars for use in future virtual reality settings.

To that end, she trains AI systems to interpret visual data from images and video.

Going Hollywood one better?

The movie industry produces lifelike avatars for film or video games through CGI (computer-generated imagery). But the industry can afford to take thousands of shots of performers.

“The problem with movie technology is that it’s not accessible to everybody,” Jafarian says. “I wanted to generate the same opportunity for the average person so they can just use their phone camera and be able to create a 3D avatar of themselves.”

Jafarian aimed to design an algorithm that needed only one photo or video of a person to generate a realistic avatar. That required a large dataset of videos to “train” the algorithm. TikTok dance videos—which often feature only one person, showing the full length of their body in multiple poses—filled the bill.

Real progress in virtual reality

After watching some 1,000 TikTok videos, Jafarian chose 340 for her dataset, each 10-15 seconds long. At 30 frames per second, that came to more than 100,000 images of people dancing.
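The frame count follows directly from the numbers in the story. A quick back-of-the-envelope check (the 340-video count, 10-15 second clip lengths, and 30 fps rate are from the article; the calculation itself is illustrative):

```python
# Back-of-the-envelope frame count for the TikTok dance dataset.
num_videos = 340                 # clips selected from ~1,000 videos watched
fps = 30                         # frames per second
min_len_s, max_len_s = 10, 15    # each clip runs 10-15 seconds

min_frames = num_videos * min_len_s * fps
max_frames = num_videos * max_len_s * fps

print(f"{min_frames:,} to {max_frames:,} frames")  # 102,000 to 153,000 frames
```

Even at the shortest clip length, the dataset yields 102,000 frames, consistent with the "more than 100,000 images" figure.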

So far, she has successfully used her algorithm to generate a 3D avatar of a person from the front view. She published her work and won a Best Paper Honorable Mention Award at the 2021 Conference on Computer Vision and Pattern Recognition (CVPR).

Jafarian plans to keep refining the algorithm until it can generate a person's entire body using just a few views. She hopes real people will one day use the technology to interact in virtual online social spaces, and not just via Zoom.

“We can have virtual environments, using VR goggles like Oculus, for example, where we can see and interact with each other,” she says. “If we can make those digital avatars realistic, it would make those interactions deeper and more interesting.”

Her research could also help all of us—that is, our avatars—try on clothes virtually, cutting down on trips to the store.

Read and watch a video on the original College of Science and Engineering story site.