The researchers wrote a paper about their technique, and they plan to present their findings at a computer graphics and interactive techniques conference next month.
“The idea is to use the technology for better communication between people,” says Ira Kemelmacher-Shlizerman, a co-author of the paper and an assistant professor in the department of computer science and engineering at the University of Washington. She thinks this technology could be useful for video conferencing. For example, one could generate a realistic video from audio even when a system’s bandwidth is too low to support video transmission. Eventually, the technique could be used as a form of teleportation in virtual reality and augmented reality, making a convincing avatar of a person appear to be in the same room as a real person, across any distance in space and time.
“We’re not learning just how to give a talking face to Siri, or to use Obama as your GPS navigation, but we’re learning how to capture human personas,” says Supasorn Suwajanakorn, a co-author of the paper. Not surprisingly, several major technology companies have taken notice: Samsung, Google, Facebook, and Intel all chipped in funding for this research. Their interest likely spans the realms of artificial intelligence, augmented reality, and robotics. “I hope we can study and transfer these human qualities to robots and make them more like a person,” Suwajanakorn told me.
Quite clearly, though, the technique could be used to deceive. People are already fooled by doctored photos, impostor accounts on social media, and other sorts of digital mimicry all the time.
Imagine the confusion that might surround a convincing video of the president being made to say something he never actually said. “I do worry,” Kemelmacher-Shlizerman acknowledged. But the good outweighs the bad, she insists. “I believe it’s a breakthrough.”
There are ways for experts to determine whether a video has been faked using this technique. Since researchers still rely on legitimate footage to produce portions of a lip-synched video, like the speaker’s head, it’s possible to identify the original video that was used to create the made-up one.
“So, by creating a database of internet videos, we can detect fake videos by searching through the database and see whether there exists a video with the same head and background,” Suwajanakorn told me. “Another artifact that can be an indication is the blurry mouth [and] teeth region. This may be not noticeable by human eyes, but a program that compares the blurriness of the mouth region to the rest of the video can easily be developed and will work quite reliably.”
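The blurriness comparison Suwajanakorn describes can be illustrated with a rough sketch. The code below is not the researchers' program. It assumes a single grayscale frame stored as a 2-D NumPy array, a mouth bounding box supplied by some face-landmark tool that is not shown, and an arbitrary threshold of 0.25 chosen only for illustration. Sharpness is estimated with the variance of a discrete Laplacian, a common focus measure that is swapped in here as a stand-in for whatever measure a real detector would use.

```python
import numpy as np


def laplacian(gray):
    """4-neighbour discrete Laplacian of a 2-D grayscale image.

    The result is two pixels smaller than the input in each dimension,
    because border pixels lack a full set of neighbours.
    """
    g = np.asarray(gray, dtype=np.float64)
    return (g[:-2, 1:-1] + g[2:, 1:-1] + g[1:-1, :-2] + g[1:-1, 2:]
            - 4.0 * g[1:-1, 1:-1])


def mouth_sharpness_ratio(gray, mouth_box):
    """Sharpness of the mouth region divided by sharpness of the rest.

    mouth_box is a hypothetical (top, bottom, left, right) tuple in pixel
    coordinates; a real system would get it from a face-landmark detector.
    """
    top, bottom, left, right = mouth_box
    lap = laplacian(gray)
    mask = np.zeros(lap.shape, dtype=bool)
    # The Laplacian output is shifted by one pixel relative to the input.
    mask[max(top - 1, 0):max(bottom - 1, 0),
         max(left - 1, 0):max(right - 1, 0)] = True
    mouth_var = lap[mask].var()
    rest_var = lap[~mask].var()
    return mouth_var / (rest_var + 1e-12)


def looks_suspicious(gray, mouth_box, threshold=0.25):
    """Flag a frame whose mouth region is much blurrier than the rest.

    The threshold is an illustrative assumption, not a calibrated value.
    """
    return mouth_sharpness_ratio(gray, mouth_box) < threshold
```

A practical detector would average this ratio over many frames and would need a threshold tuned on real footage, since a naturally soft or out-of-focus mouth would otherwise trigger false alarms.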
It also helps if you have two or more recordings of a person from different views, Suwajanakorn said. That’s much harder to fake. These are useful safeguards, but the technology will still pose challenges as people realize its potential. Not everyone will know how to seek out the databases and programs that allow for careful vetting, or even think to question a realistic-looking video in the first place. And those who share misinformation unintentionally will likely exacerbate the increasing distrust in experts who can help make sense of things.
“My thought is that people will not believe videos, just like how we do not believe photos once we’re aware that tools like Photoshop exist,” Suwajanakorn told me. “This could be both good and bad, and we have to move on to a more reliable source of evidence.”
But what does reliability mean when you cannot believe your own eyes? With enough convincing distortions to reality, it becomes very difficult to know what’s real.