November 16, 2015

A program that captions your photos

by Ecole Polytechnique Federale de Lausanne

Two researchers at Idiap, a research institute in Martigny that is affiliated with EPFL, developed an algorithm that – unlike systems recently unveiled by Google and Microsoft – can describe an image without having to pull up captions that it has already learned. To do this, the researchers used a program capable of making vector representations of images and captions based on an analysis of caption syntax.

"When we give it a photo, the program compares the image vector to the vector of possible words and selects the most likely noun, verb and prepositional phrases," said Rémi Lebret, a PhD student specializing in deep learning at Idiap. This is how the system finds the most likely description for a photo of a man skateboarding, for example, even if it has never seen a similar photo before. The computer breaks the picture down into elements ("a skateboard, a man, a ramp") and verbs that could describe the action ("riding") before captioning the picture.
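The comparison Lebret describes can be pictured with a small sketch. This is not the researchers' code: the three-dimensional vectors, the phrase lists, the use of cosine similarity as the scoring function, and the fixed noun-verb-preposition template used to join the winners are all assumptions made for illustration. The article does not say how the real system scores phrases or assembles them into a sentence.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# Hypothetical phrase vectors, grouped by phrase type (made-up numbers).
PHRASES = {
    "noun": {
        "a man": [0.8, 0.5, 0.1],
        "a skateboard": [0.8, 0.3, 0.1],
        "a cat": [0.0, 0.1, 0.9],
    },
    "verb": {
        "riding": [0.7, 0.6, 0.0],
        "sleeping": [0.0, 0.2, 0.9],
    },
    "prep": {
        "on a ramp": [0.6, 0.7, 0.1],
        "on a sofa": [0.1, 0.1, 0.9],
    },
}

def rank_phrases(image_vec, phrases, top_k=1):
    """Return the top_k best-matching phrases of each type for an image vector."""
    ranked = {}
    for kind, candidates in phrases.items():
        ordered = sorted(
            candidates,
            key=lambda phrase: cosine(image_vec, candidates[phrase]),
            reverse=True,
        )
        ranked[kind] = ordered[:top_k]
    return ranked

# Hypothetical vector standing in for a photo of a man skateboarding.
image_vec = [0.8, 0.5, 0.0]
best = rank_phrases(image_vec, PHRASES)

# Joining the winners with a fixed template is a simplification for the sketch.
caption = " ".join(best[kind][0] for kind in ("noun", "verb", "prep"))
print(caption)
```

Because each phrase is scored directly against the image, the program can assemble a description from phrases it has learned separately, which is one way to read the claim that it handles photos unlike any it has seen before.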

Getting it right

This approach is unlike existing ones. "Those other systems propose the first word based on the photo and then use that word to predict subsequent ones," said Pedro Oliveira Pinheiro, the other Idiap researcher on this project. Those algorithms, which are based on sequence labeling with recurrent neural networks, can cause problems, however: if one of them poorly predicts the start of the phrase, the entire caption will necessarily be wrong. Those systems also have a longer learning curve, and they tend to recycle previously used captions.
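The weakness Pinheiro points to can be shown with a toy word-by-word generator. This is a deliberately crude stand-in, assuming a fixed lookup table in place of a trained recurrent neural network; the table contents and the function name `greedy_caption` are hypothetical.

```python
# Hypothetical "most likely next word" table, standing in for a trained model.
NEXT_WORD = {
    "man": "riding",
    "riding": "skateboard",
    "cat": "sleeping",
    "sleeping": "sofa",
}

def greedy_caption(first_word, next_word=NEXT_WORD, max_len=5):
    """Build a caption one word at a time, each word chosen from the previous one."""
    words = [first_word]
    while words[-1] in next_word and len(words) < max_len:
        words.append(next_word[words[-1]])
    return words

# A correct first word leads to a sensible chain...
print(greedy_caption("man"))
# ...while a wrong first word drags every later word along with it.
print(greedy_caption("cat"))
```

Since every word depends only on what came before it, nothing later in the chain can correct a bad start, which is the failure mode described above.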

The technology developed by Pinheiro and Lebret is simpler and works better. It has also piqued the interest of social media companies. The two researchers did a six-month research internship at Facebook, which is drawing on their work to develop its own automatic captioning model, meant in part for the visually impaired. The two researchers believe that their algorithm could be improved in the future through the use of more complex language models and by linking it to larger databases.

More information: Phrase-based Image Captioning. arxiv.org/abs/1502.03671

Provided by Ecole Polytechnique Federale de Lausanne

Citation: A program that captions your photos (2015, November 16) retrieved 17 July 2024 from https://phys.org/news/2015-11-captions-photos.html

