Team creates automated method to assemble story-driven photo albums
Taking photos has never been easier, thanks to the ubiquity of cell phones, tablets and digital cameras. But editing a mass of vacation photos into an album remains a chore. A new automated method developed by Disney Research could ease that task while also telling a compelling story.
The method, developed by a team led by Leonid Sigal, senior research scientist at Disney Research, attempts not only to select photos based on quality and relevance, but also to order them in a way that makes narrative sense.
"Professional photographers, whether they are assembling a wedding album or a photo slideshow, know that the strict chronological order of the photos is often less important than the story that is being told," Sigal said. "But this process can be laborious, particularly when large photo collections are involved. So we looked for ways to automate it."
Sigal and his collaborators will present their findings at WACV 2015, the IEEE Winter Conference on Applications of Computer Vision, Jan. 6-9, in Waikoloa Beach, Hawaii. Others involved include Disney Research's Rafael Tena; Fereshteh Sadeghi, a computer science graduate student at the University of Central Florida; and Ali Farhadi, assistant professor of computer science and engineering at the University of Washington.
The team looked at ways of arranging vacation photos into a coherent album. Previous efforts on automated album creation have relied on arranging photos based largely on chronology and geo-tagging, Sigal noted.
But when four people were asked to choose and assemble five-photo albums that told a story, the researchers noted that these individuals placed photos out of chronological order about 40 percent of the time. Subsequent preference testing using Mechanical Turk showed people preferred these human-assembled albums over those chosen randomly or those based on chronology.
To build a computerized system capable of creating a compelling visual story, the researchers developed a model that could assemble albums based on a variety of photo features, including the presence or absence of faces and their spatial layout; overall scene textures and colors; and the esthetic quality of each image.
Their model also incorporated learned rules for how albums are assembled, such as preferences for certain types of photos to be placed at the beginning, in the middle and at the end of albums. An album about a Disney World visit, for instance, might begin with a family photo in front of Cinderella's castle or with Mickey Mouse. Photos in the middle might pair a wide shot with a close-up, or vice versa. Exclusionary rules, such as avoiding the use of the same type of photo more than once, were also learned and incorporated.
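To make the idea of such rules concrete, here is a minimal sketch in Python of how an album could be scored and selected. It is an illustration only, not the researchers' model: the photo records, the type labels, the `START_BONUS`, `PAIR_BONUS` and `REPEAT_PENALTY` values and the function names are all hypothetical, and the weights are set by hand here rather than learned from annotated albums.

```python
from itertools import permutations

# Hypothetical photo records: (id, type, quality score between 0 and 1).
# None of these values come from the study.
PHOTOS = [
    ("p1", "group_landmark", 0.9),
    ("p2", "wide", 0.7),
    ("p3", "closeup", 0.8),
    ("p4", "wide", 0.6),
    ("p5", "character", 0.85),
    ("p6", "closeup", 0.5),
]

# Hand-set stand-ins for weights that a real system would learn.
START_BONUS = {"group_landmark": 1.0, "character": 0.8}            # good opening shots
PAIR_BONUS = {("wide", "closeup"): 0.5, ("closeup", "wide"): 0.5}  # good neighbors
REPEAT_PENALTY = 0.7                                               # cost per repeated type


def score_album(album):
    """Score an ordered sequence of photos under the toy rules above."""
    # Reward overall image quality.
    score = sum(quality for _, _, quality in album)
    # Reward a preferred photo type in the opening position.
    score += START_BONUS.get(album[0][1], 0.0)
    # Reward adjacent wide/close-up pairings, in either order.
    for left, right in zip(album, album[1:]):
        score += PAIR_BONUS.get((left[1], right[1]), 0.0)
    # Penalize each photo type that appears more than once.
    types = [photo_type for _, photo_type, _ in album]
    score -= REPEAT_PENALTY * (len(types) - len(set(types)))
    return score


def best_album(photos, size=5):
    """Try every ordered selection of `size` photos and keep the top scorer."""
    return max(permutations(photos, size), key=score_album)


if __name__ == "__main__":
    for photo_id, photo_type, _ in best_album(PHOTOS):
        print(photo_id, photo_type)
```

Exhaustive search is workable only for a handful of photos, since the number of ordered selections grows very quickly with collection size; a practical system would need a more efficient search strategy.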
The researchers used a machine learning algorithm to enable the system to learn how humans use those features and what rules they use to assemble photo albums. The training sets used for this purpose were created for the study from thousands of Flickr photos. These included 63 image collections in five topic areas: trips to Disney theme parks; beach vacations; and trips to London, Paris and Washington, D.C. Each collection was annotated by four people, who were asked to assemble five-photo albums that told stories and to group images into sets of near duplicates.
Once the system learned the principles of selecting and ordering photos, it was able to compose photo albums from unordered and untagged collections of photos. Sigal noted that such a system can also learn how individuals prefer to assemble these collections, allowing it to customize the album creation process.