Turning photos into an interactive experience
Human facial expressions, big and small, can convey what a person is feeling. Imagine being able to bring out a wide range of human emotions in any still photo, and to do so automatically.
Computer scientists at Tel-Aviv University collaborated with researchers at Facebook to develop a new computational technique that makes it possible for users to animate still images in a highly realistic way. The method enables subjects in a still photo to come to life and express various emotions. For instance, when a user "likes" a photo on Facebook or tweets a thumbs-up, the facial expression of his or her profile photo could automatically react with a smile or express an emotion of approval or happiness. The researchers demonstrate their technique on various facial images, including selfies, old portraits and facial avatars. Their results show portraits brought to life, depicting the person in the photo as though they were actually breathing, smiling or frowning, for example.
"Facial expressions in humans convey not just major emotions, but through subtle variations, a rather nuanced view into the emotional state of a person," said Hadar Averbuch-Elor, computer science PhD at Tel-Aviv University and a lead author of the study. "This expressiveness is what we attempt to capture and animate in our work."
Coauthors of the research, "Bringing Portraits to Life," include Daniel Cohen-Or, professor of computer science at Tel-Aviv University, and Facebook research scientists Michael F. Cohen and Johannes Kopf. The team will present their innovative work at SIGGRAPH Asia 2017 in Bangkok, 27 November to 30 November. This annual conference and exhibition showcases the world's leading professionals, academics and creative minds at the forefront of computer graphics and interactive techniques.
Given a single image, the researchers' method automatically generates photo-realistic video clips that express various emotions. Previous facial animation techniques typically assume the availability of a video of the target face, which exhibits variation in pose and expression. In this work, the researchers use as input only a single image of a target face to animate it.
"This makes our method more widely applicable, to the near endless supply of portraits or selfies on the Internet," explained Cohen. "We animate the single target face image from a driving video, allowing the target image—the still photo—to come alive and mimic the expressiveness of the subject in the driving video."
To this end, the target image in the still photo is animated by a series of geometric warps that imitate the facial transformations in the driving video. Previous work has been restricted to animating only the face region but, within limits, this new method can animate the full head and upper body. The researchers manipulate the face with lightweight 2D warps and can create moderate head movement that looks very realistic, without having to convert or project the image to 3D. Technically, the main challenge, according to the researchers, was to preserve the identity of the person's face in the still photo while manipulating it with warps and features taken and transferred from frames of a driving video.
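To make the idea of warps that "imitate" a driving video more concrete, here is a minimal sketch in Python. It is not the researchers' implementation. It assumes, purely for illustration, that matching facial landmark points are already available for the still photo, for a neutral frame of the driving video and for the current driving frame. The function names `face_scale` and `transfer_displacements` are hypothetical. The sketch only moves landmark points; a real system would then warp the image pixels to follow them.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def face_scale(landmarks: List[Point]) -> float:
    """Rough face size: the diagonal of the landmarks' bounding box."""
    xs = [p[0] for p in landmarks]
    ys = [p[1] for p in landmarks]
    return math.hypot(max(xs) - min(xs), max(ys) - min(ys))


def transfer_displacements(
    target_neutral: List[Point],
    driver_neutral: List[Point],
    driver_frame: List[Point],
) -> List[Point]:
    """Move each target landmark by the driver's motion, rescaled to the target face.

    Assumption (not from the article): all three lists hold the same
    landmarks in the same order.
    """
    scale = face_scale(target_neutral) / face_scale(driver_neutral)
    moved = []
    for (tx, ty), (nx, ny), (fx, fy) in zip(target_neutral, driver_neutral, driver_frame):
        # Displacement of this landmark in the driving video since the neutral frame.
        dx, dy = fx - nx, fy - ny
        moved.append((tx + scale * dx, ty + scale * dy))
    return moved


# Toy example: the driver's third landmark moves down by 1 unit; the target
# face is twice as large, so its third landmark moves down by 2 units.
target = [(0.0, 0.0), (6.0, 0.0), (3.0, 8.0)]
driver_rest = [(0.0, 0.0), (3.0, 0.0), (1.5, 4.0)]
driver_now = [(0.0, 0.0), (3.0, 0.0), (1.5, 5.0)]
animated = transfer_displacements(target, driver_rest, driver_now)
```

Rescaling the motion by relative face size is one simple way to keep the target's own proportions, which relates to the identity-preservation challenge the researchers describe.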
The new method also includes continuous transfer of fine-scale details, such as facial wrinkles and creases, while avoiding so-called "outlier" wrinkles that are caused by cast shadows or misalignment between the warped video frames. When needed, their computational technique can also automatically fill in "hidden" facial details. For instance, the interior of the mouth can be depicted even if the person in the target photo has his/her mouth closed.
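One common way to think about transferring fine details while rejecting outliers is a ratio-based approach, sketched below in Python. This is a swapped-in illustration and may differ from what the researchers actually do. It assumes grayscale images stored as aligned grids of values between 0 and 1. The function name `transfer_detail` and the threshold `max_change` are hypothetical.

```python
from typing import List

Image = List[List[float]]  # grayscale values in [0, 1]


def transfer_detail(
    target: Image,
    driver_neutral: Image,
    driver_frame: Image,
    max_change: float = 0.3,
    eps: float = 1e-6,
) -> Image:
    """Scale each target pixel by how much the driver pixel changed.

    Changes larger than max_change are treated as outliers (for example a
    cast shadow or a misaligned pixel) and are ignored. This is an
    illustrative assumption, not the published method.
    """
    result = []
    for t_row, n_row, f_row in zip(target, driver_neutral, driver_frame):
        row = []
        for t, n, f in zip(t_row, n_row, f_row):
            ratio = f / max(n, eps)  # relative brightness change in the driver
            if abs(ratio - 1.0) > max_change:
                ratio = 1.0  # outlier: leave the target pixel unchanged
            row.append(min(1.0, max(0.0, t * ratio)))
        result.append(row)
    return result


# Toy example with one row of three pixels: no change, a slight darkening
# (a plausible crease), and a drastic darkening (treated as an outlier).
still = [[0.5, 0.5, 0.5]]
rest = [[0.8, 0.8, 0.8]]
frame = [[0.8, 0.72, 0.2]]
detailed = transfer_detail(still, rest, frame)
```

In the toy example, only the middle pixel is darkened slightly; the drastic change in the third pixel is discarded rather than copied onto the still photo.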
In a user study, participants were presented with 24 randomly selected videos, eight of which were real. They were asked to rate each video as real or not real based on how the animation appeared; an animated video identified as real 50 percent of the time would count as a perfect score. The researchers' animated videos were identified as real 46 percent of the time. The "happy" animations were perceived as the most real (identified as real 58 percent of the time), and the "surprise" ones as the least real (identified as real 37 percent of the time).
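The arithmetic behind these figures can be laid out in a few lines of Python. The sketch below uses only the numbers reported above. The variable and function names are illustrative, and treating 50 percent as the reference point follows the article's description of a perfect score.

```python
TOTAL_VIDEOS = 24
REAL_VIDEOS = 8
PERFECT_SCORE = 50  # percent, as described in the article

# Share of the time each group of animated videos was identified as real.
reported_rates = {"all animated": 46, "happy": 58, "surprise": 37}


def gap_from_perfect(rate_percent: int) -> int:
    """Percentage points above (+) or below (-) the 50 percent reference."""
    return rate_percent - PERFECT_SCORE


animated_videos = TOTAL_VIDEOS - REAL_VIDEOS
gaps = {label: gap_from_perfect(rate) for label, rate in reported_rates.items()}

print(animated_videos)  # 16
print(gaps)  # {'all animated': -4, 'happy': 8, 'surprise': -13}
```

Read this way, the animated videos overall fell 4 percentage points short of the reference, the "happy" animations exceeded it by 8 points, and the "surprise" animations fell 13 points below it.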
In the future, the team intends to build on this work by combining their technique with 3D methods or coupling it with artificial intelligence to create an interactive avatar starting from a single photograph.