New metamaterial device solves the cocktail party problem

August 11, 2015 by Bob Yirka
(A) Measurement performed in an anechoic chamber. (Left) Photo of the metamaterial listener in the chamber. (Right) Schematic of the setup and two examples of synthesized words. (B) Measured transfer functions for the locations of three speakers. Credit: Yangbo Xie, PNAS, doi: 10.1073/pnas.1502276112

(Phys.org) A team of researchers at Duke University has found a way to solve what is known as the cocktail party problem, getting a computer to pick out different human voices among multiple speakers in a single room. In their paper, published in Proceedings of the National Academy of Sciences, they describe the device they constructed and the algorithm that goes along with it.

Most people have the uncanny ability to stand among a group of people, many of whom are talking, and pick out at will the words being spoken by any given individual. Our brains are somehow able to combine all the necessary ingredients (pitch, tone, distance and, perhaps most importantly, filtering) so that we process only the words being spoken by the person we are focusing on. Getting a computer to accomplish the same feat has been difficult. Most solutions rely on the placement of multiple microphones, though some newer approaches have relied on other techniques. Unfortunately, most such efforts have not led to a computer being anywhere near as accurate as a human being, until now.

The device developed by the team at Duke is made of plastic and is approximately the size and shape of a pizza, though a bit thicker. It was constructed using a 3D printer. It is made up of 36 pie slices, or wedges, each made of a honeycomb-structured acoustic metamaterial. Openings around the edges channel the sound toward a microphone that is fixed in the center of the hub. The wedges slightly modify the sound that passes through them in a beneficial way (attenuating certain frequencies), so that sound arriving from each direction picks up its own distortion. The sound captured by the microphone is then processed by an algorithm running on a computer that is able to localize what has been heard and assign words to a given speaker.

This prototype sensor can separate simultaneous sounds coming from different directions using a unique distortion given by the slice of "pie" that it passes through. Credit: Steve Cummer, Duke University

In testing their system, which the team describes as combining acoustic metamaterials and compressive sensing, they found it to be 96.7 percent accurate when run with three overlapping sound sources. They believe their device could be used in speech recognition applications and perhaps in other acoustic imaging and sensing applications as well. With some modifications, it might even be used in hearing aids.
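To give a rough feel for how a single microphone plus direction-dependent filtering can separate overlapping sources, here is a minimal Python sketch. It is not the Duke team's algorithm. It assumes random stand-in transfer functions for three directions, a hypothetical dictionary of 40 "word" spectra, and uses orthogonal matching pursuit as one common compressive-sensing solver. All names and sizes (`build_sensing_matrix`, `omp`, `n_freqs` and so on) are illustrative assumptions.

```python
import numpy as np

def build_sensing_matrix(transfer, words):
    """Each column is one dictionary word as heard through one direction's filter.

    transfer: (n_dirs, n_freqs) complex response per direction (assumed known
              from a calibration measurement)
    words:    (n_words, n_freqs) complex spectra of the dictionary sounds
    Column index = direction * n_words + word.
    """
    n_dirs, n_freqs = transfer.shape
    n_words = words.shape[0]
    cols = transfer[:, None, :] * words[None, :, :]   # (n_dirs, n_words, n_freqs)
    return cols.reshape(n_dirs * n_words, n_freqs).T  # (n_freqs, n_dirs * n_words)

def omp(A, y, n_nonzero):
    """Orthogonal matching pursuit: greedy sparse solution of y ~ A @ x."""
    norms = np.linalg.norm(A, axis=0)
    residual = y.copy()
    support = []
    coef = np.zeros(0, dtype=complex)
    for _ in range(n_nonzero):
        # Pick the column most correlated with what is still unexplained.
        corr = np.abs(A.conj().T @ residual) / norms
        if support:
            corr[support] = 0.0
        support.append(int(np.argmax(corr)))
        # Refit all chosen columns jointly, then update the residual.
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    x = np.zeros(A.shape[1], dtype=complex)
    x[support] = coef
    return x, sorted(support)

# Synthetic stand-ins (assumptions, not measured data).
rng = np.random.default_rng(0)
n_freqs, n_dirs, n_words = 200, 3, 40
transfer = rng.normal(size=(n_dirs, n_freqs)) + 1j * rng.normal(size=(n_dirs, n_freqs))
words = rng.normal(size=(n_words, n_freqs)) + 1j * rng.normal(size=(n_words, n_freqs))
A = build_sensing_matrix(transfer, words)

# Three speakers, one per direction, each saying a different word at once.
true_words = [5, 17, 32]
true_cols = sorted(d * n_words + w for d, w in enumerate(true_words))
y = A[:, true_cols].sum(axis=1)  # what the single microphone records
y += 0.05 * (rng.normal(size=n_freqs) + 1j * rng.normal(size=n_freqs))  # sensor noise

x_hat, found = omp(A, y, n_nonzero=3)
decoded = [(c // n_words, c % n_words) for c in found]  # (direction, word) pairs
print(decoded)
```

The point of the sketch is that the separation depends only on the filters being different enough for each direction, not on any property of the voices themselves, which matches the abstract's description of "physical layer encoding without relying on source characteristics."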

The prototype sensor is tested in a sound-dampening room to eliminate echoes and unwanted background noise. Credit: Steve Cummer, Duke University

More information: Single-sensor multispeaker listening with acoustic metamaterials, Yangbo Xie, PNAS, DOI: 10.1073/pnas.1502276112

Abstract
Designing a "cocktail party listener" that functionally mimics the selective perception of a human auditory system has been pursued over the past decades. By exploiting acoustic metamaterials and compressive sensing, we present here a single-sensor listening device that separates simultaneous overlapping sounds from different sources. The device with a compact array of resonant metamaterials is demonstrated to distinguish three overlapping and independent sources with 96.67% correct audio recognition. Segregation of the audio signals is achieved using physical layer encoding without relying on source characteristics. This hardware approach to multichannel source separation can be applied to robust speech recognition and hearing aids and may be extended to other acoustic imaging and sensing applications.
