A new framework being developed by a PhD student and academics from the University of Lincoln, UK, will enable people to search for videos on the internet using images rather than text.
Typing in text to find a film clip on YouTube often results in diverse (and sometimes unrelated) videos being suggested.
This problem could soon be resolved with the advent of smarter video-search engines that are able to pick and choose the most relevant videos by analysing a tiny fraction of video frames.
Research to create a quick and easy framework that is able to discover semantic similarities between videos without using text-based tags is being carried out at the University of Lincoln, UK.
The volume of video data is increasing rapidly, with more than 4 billion hours of video being watched on YouTube each month. The majority of available video data exists in a compressed format, and the first step towards effective video retrieval is to extract features from these compressed videos.
School of Computer Science PhD student Saddam Bekhet, along with Dr Amr Ahmed and Professor Andrew Hunter, has produced a paper on recent work which suggests a framework towards real-time video matching.
Saddam said: "Everyone uses search engines but currently you are only able to search by text even to search for a video clip, thus some results are far removed from what you were looking for. With the huge volume of data, a smarter video analyser is required to associate semantic tags to the uploaded videos, allowing more efficient indexing and search (including the contents of the video). Being able to enhance the underlying search mechanism (or even input a visual query) would really enhance the likes of YouTube."
Saddam's framework finds similarities between videos using tiny frames instead of the full-size video frames. Such tiny frames are easily extracted from a compressed video in real time and are able to fully represent the video content, which avoids spending time decompressing the video before running complex algorithms on it.
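To give a feel for what a "tiny frame" might look like, here is a minimal sketch in Python. It is not the authors' implementation. It assumes the tiny frame is a DC-image (the term used in the paper's title) and relies on the fact that, in block-based codecs such as JPEG or MPEG, the DC coefficient of an intra-coded 8x8 block is proportional to that block's average intensity, up to a constant offset. A real system would read those coefficients straight from the compressed stream; the sketch approximates them by averaging blocks of an already decoded greyscale frame. The names `dc_image` and `frame_distance` are hypothetical, and the mean absolute difference is only a simple stand-in for the matching step, not the local-feature matching the paper describes.

```python
from typing import List

# A greyscale frame as rows of pixel intensities (illustrative type alias).
Frame = List[List[float]]


def dc_image(frame: Frame, block: int = 8) -> Frame:
    """Approximate a DC-image: one value per block x block tile (its mean).

    Rows and columns that do not fill a whole tile are ignored.
    """
    rows = len(frame) // block
    cols = len(frame[0]) // block
    tiny: Frame = []
    for by in range(rows):
        tiny_row = []
        for bx in range(cols):
            total = 0.0
            for y in range(by * block, (by + 1) * block):
                for x in range(bx * block, (bx + 1) * block):
                    total += frame[y][x]
            tiny_row.append(total / (block * block))
        tiny.append(tiny_row)
    return tiny


def frame_distance(a: Frame, b: Frame) -> float:
    """Mean absolute difference between two same-size tiny frames.

    Assumed stand-in similarity measure: 0.0 means identical frames.
    """
    diffs = [abs(pa - pb) for ra, rb in zip(a, b) for pa, pb in zip(ra, rb)]
    return sum(diffs) / len(diffs)
```

With 8x8 blocks, a 16x16 frame shrinks to 2x2 values, a 64-fold reduction in the data that has to be compared, which is the kind of saving that makes real-time matching plausible.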
"I want to discover the semantic similarity between videos using the content only," he explained. "I adapted some new techniques and found that tiny representative frames could be used to discover similarities. The next stage is to build an effective framework."
The research follows on from work carried out to provide a framework for automated video analysis and annotation, by Lincoln's Digital Contents Analysis, Production and Interaction (DCAPI) Group.
The paper 'Video Matching Using DC-image and Local Features' was awarded the Best Student Paper Award at the International Conference of Signal and Image Engineering - part of the 2013 World Congress on Engineering. Organised by the International Association of Engineers (IAENG), the conference focuses on frontier topics in theoretical and applied engineering and computer science.