New speech recognition model: Hidden Conditional Neural Fields

September 25, 2013, Toyohashi University of Technology
Model architectures of HMM, HCFR and HCNF.

Toyohashi Tech researchers propose the Hidden Conditional Neural Fields (HCNF) model for continuous speech recognition. The model is a combination of the Hidden Conditional Random Fields (HCRF) and a Multi-Layer Perceptron (MLP), that is, an extension of Hidden Markov Model (HMM).

This new model has the discriminative property for sequences from HCRF and the ability to extract non-linear features from an MLP. Furthermore, the HCNF can incorporate many types of features from non-linear features can be extracted, and is trained by sequential criteria.

In this paper, the researchers describe the formulation of HCNF and examine three methods to further improve using HCNF, which was an objective function that explicitly considered training errors, provided a hierarchical tandem-style feature, and included a deep non-linear feature extractor for the observation function.

HCRF can use a deep feed forward (DNN) in the observation function, and therefore, a sophisticated pre-training algorithm such as the deep belief network (DBN) can be used to provide a deep observation function.

The research shows that HCNF can be trained realistically without any initial model and outperform the HCRF and triphone hidden Markov model trained by the minimum phone error (MPE) manner using for continuous English phoneme recognition on the TIMIT core test and Japanese phoneme recognition on the IPA 100 test set.

Explore further: Speech recognition leaps forward

More information: Fujii, Y., Yamamoto, K. and Nakagawa, S. Hidden Conditional Fields for Continuous Phoneme Speech Recognition, IEICE Trans. Inf.&Sys., E95-D,2094-2104 (2012). DOI: 10.1587/transinf.E95.D.2094

Related Stories

Speech recognition leaps forward

August 29, 2011

During Interspeech 2011, the 12th annual Conference of the International Speech Communication Association being held in Florence, Italy, from Aug. 28 to 31, researchers from Microsoft Research will present work that dramatically ...

Research aims to improve speech recognition software

August 11, 2010

Anyone who has used an automated airline reservation system has experienced the promise - and the frustration - inherent in today's automatic speech recognition technology. When it works, the computer "understands" that you ...

Bionic speech recognition

September 9, 2010

As speech recognition systems become more commonplace - on the computer desktop top, at the call centre and even in the car - it is increasingly important to ensure that the voice signal is as clear as possible before it ...

Smart listeners and smooth talkers

November 17, 2011

Human-like performance in speech technology could be just around the corner, thanks to a new research project that links three UK universities.

Recommended for you

Google Assistant adds more languages in global push

February 23, 2018

Google said Friday its digital assistant software would be available in more than 30 languages by the end of the years as it steps up its artificial intelligence efforts against Amazon and others.

Researchers find tweeting in cities lower than expected

February 20, 2018

Studying data from Twitter, University of Illinois researchers found that less people tweet per capita from larger cities than in smaller ones, indicating an unexpected trend that has implications in understanding urban pace ...

Augmented reality takes 3-D printing to next level

February 20, 2018

Cornell researchers are taking 3-D printing and 3-D modeling to a new level by using augmented reality (AR) to allow designers to design in physical space while a robotic arm rapidly prints the work.

0 comments

Please sign in to add a comment. Registration is free, and takes less than a minute. Read more

Click here to reset your password.
Sign in to get notified via email when new comments are made.