Advancing AI for video: Startup launches powerful video processing platform
Voxel51, a University of Michigan startup, today launched its flagship product—a software platform designed to make it easier, faster and more affordable to access the untapped potential of video data.
The software is aimed at companies that work with video but struggle to extract the information they need from it. While video is a rich form of data, it is difficult to analyze and search because of its complexity, large file sizes and lack of defined units like words.
Voxel51 has set out to overcome those obstacles with its video analytics platform and open source software libraries that, together, enable state-of-the-art video recognition. The platform identifies and follows objects and actions in each clip. As co-founder Brian Moore says, "We transform video into value."
The company's initial focus is on video footage of road scenes, which is particularly relevant to driverless cars, and on footage used for public safety. In both of these applications, cameras are key sensors, but it is time-consuming for humans to process the data so that a computer can analyze it. Faster, automated processing should speed the development of better computer vision.
"This is the first and only publicly available platform for road scene understanding," said co-founder Jason Corso, a professor of electrical and computer engineering. "The auto companies are building them, but in proprietary silos. Ours will be available for anyone to use and try.
"By democratizing video processing and access to large, annotated libraries, we enable younger startups to compete with the well-resourced teams working on driverless cars and other computer vision applications in large companies."
In driverless vehicles today, perception algorithms are produced with machine learning techniques, which means that they need to be trained on video clips that are annotated with object identification and tracking—for instance, pedestrians, vehicles, lamp posts, signs and traffic lights.
Before the systems are trained, the video must be annotated—usually by a human. That's why it is time-consuming and expensive to create training data for machine-learning algorithms.
With Voxel51, users can rely on the platform's AI software to speed up much of this process. Then, it's possible to search for very specific video content—for example, a dog walker. And with the open source library, some users have access to much larger datasets than they could otherwise afford to acquire.
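To make the idea of searching annotated video concrete, here is a minimal sketch in Python. It is not Voxel51's data format or API. The record fields (`frame`, `track_id`, `label`) and the helper `frames_with_all` are hypothetical names chosen for illustration. The sketch assumes each annotation marks one labeled object in one frame, and treats a "dog walker" query as frames that contain both a pedestrian and a dog.

```python
# Hypothetical annotation records; not Voxel51's actual schema.
# Each record marks one labeled object in one frame. A shared
# track_id across frames means the same object is being followed.
annotations = [
    {"frame": 0, "track_id": 1, "label": "pedestrian"},
    {"frame": 0, "track_id": 2, "label": "vehicle"},
    {"frame": 1, "track_id": 1, "label": "pedestrian"},
    {"frame": 1, "track_id": 3, "label": "dog"},
    {"frame": 2, "track_id": 3, "label": "dog"},
]


def frames_with_all(records, labels):
    """Return sorted frame indices in which every requested label appears."""
    wanted = set(labels)
    labels_by_frame = {}
    for record in records:
        labels_by_frame.setdefault(record["frame"], set()).add(record["label"])
    return sorted(
        frame for frame, present in labels_by_frame.items() if wanted <= present
    )


# Crude stand-in for a "dog walker" query: a pedestrian and a dog together.
dog_walker_frames = frames_with_all(annotations, ["pedestrian", "dog"])
```

Once annotations exist, a query like this is a cheap lookup. The expensive step is producing the annotations in the first place, which is the part the article says the platform automates.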
"There's a clear need for the kind of cutting-edge AI technology that's been developed by Voxel51 in the automated vehicle space, and we're enthusiastic about the progress that's already being made," said Bryce Pilz, director of licensing at U-M Tech Transfer.
"Right now, Voxel51 technology is helping autonomous vehicles at Mcity make sense of what they are seeing on the road so that they can make better decisions, and we have no doubt that we'll eventually see these innovations making their way into production vehicles, making them safer, more efficient and reliable."
Beyond putting powerful AI video analysis into the hands of developers, Voxel51's main under-the-hood differentiator is that its processing operates in the space-time volume across frames, where it can capture motion and appearance changes over time. In other words, the software looks not at pixels in a single frame but at voxels, the volume elements of that space-time block.
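The difference between pixels and voxels can be illustrated with a small Python sketch. It is an illustration under simple assumptions, not Voxel51's method. The toy video is a nested list indexed as `video[t][y][x]`, and the helpers `voxel`, `temporal_difference` and `motion_energy` are hypothetical names. Differences between frames stand in here for the motion that a space-time approach can capture.

```python
# Illustrative only; not Voxel51's implementation.
# A tiny grayscale video: 3 frames, each 2 rows by 3 columns,
# indexed as video[t][y][x]. A bright spot moves right, then stops.
video = [
    [[0, 0, 0],
     [0, 9, 0]],
    [[0, 0, 0],
     [0, 0, 9]],
    [[0, 0, 0],
     [0, 0, 9]],
]


def voxel(volume, t, y, x):
    """Return the intensity at one space-time location (one voxel)."""
    return volume[t][y][x]


def temporal_difference(volume):
    """Absolute intensity change between consecutive frames, per location."""
    return [
        [
            [abs(after - before) for before, after in zip(row_prev, row_next)]
            for row_prev, row_next in zip(frame_prev, frame_next)
        ]
        for frame_prev, frame_next in zip(volume, volume[1:])
    ]


def motion_energy(volume):
    """Total change for each frame-to-frame transition."""
    return [
        sum(sum(row) for row in diff) for diff in temporal_difference(volume)
    ]


energy = motion_energy(video)
```

Any single frame looks nearly the same as the others: one bright spot on a dark background. Only by comparing values along the time axis does the first transition show change and the second show none, and that is information a frame-by-frame view of pixels does not carry.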
The company has raised $2 million in venture capital. It's located in Ann Arbor, employs 15 people and is hiring many more, Corso says.
"Since the dawn of modern computing, humans have been adapting to computers. I think it's about time computers started adapting to us, and that involves a deeper understanding of the visual world," Corso said. "Voxel51's new platform is an important step in that direction. We want to enable new companies to add visual perception capabilities with ease and power where they otherwise would not have been able."