Date: October 27, 2006
Time: 3:00 p.m.
Location: Centergy One 5186
Speaker(s): Yu Tsao
Title: A Vector Space Approach to Environment Modeling for Robust Speech Recognition
Abstract:
A vector space approach to characterizing environments for robust speech recognition is proposed. We represent a given environment by a super-vector formed by concatenating the mean vectors of all Gaussian mixture components in the state observation densities of all hidden Markov models trained in that environment. A super-vector for a new environment can then be obtained in one of two ways: by interpolating among a collection of super-vectors trained from many real or simulated environments, or by transforming an anchor super-vector for a specific environment, such as a clean condition. At a 5 dB signal-to-noise ratio (SNR), using only two adaptation utterances, both the interpolation-based and transformation-based approaches achieve a significant error rate reduction of close to 47% over a baseline system that uses cepstral mean subtraction (CMS). When N-best information is incorporated to perform unsupervised adaptation at 5 dB SNR with the same two utterances, we achieve a relative error reduction of about 40%, close to that achieved in the supervised mode.
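The super-vector idea in the abstract can be illustrated with a small sketch. It is not the speaker's implementation. The abstract does not state the exact form of the interpolation or the transformation, so the weighted linear combination and the per-dimension affine map below are assumptions, as are all function names and the toy numbers. Estimating the weights or transform parameters from adaptation utterances is not shown.

```python
from typing import List, Sequence

Vector = List[float]


def build_supervector(hmm_means: Sequence[Sequence[Sequence[float]]]) -> Vector:
    """Flatten means indexed as [model][mixture component][dimension]
    into one environment super-vector by concatenation."""
    return [x for model in hmm_means for mean in model for x in mean]


def interpolate(supervectors: Sequence[Vector], weights: Sequence[float]) -> Vector:
    """Assumed form of interpolation: a weighted sum of known
    environment super-vectors."""
    if len(supervectors) != len(weights):
        raise ValueError("need exactly one weight per super-vector")
    dim = len(supervectors[0])
    result = [0.0] * dim
    for sv, w in zip(supervectors, weights):
        if len(sv) != dim:
            raise ValueError("all super-vectors must have the same length")
        for i, x in enumerate(sv):
            result[i] += w * x
    return result


def transform(anchor: Vector, scale: Vector, bias: Vector) -> Vector:
    """Assumed form of transformation: a per-dimension affine map
    applied to an anchor (for example, clean-condition) super-vector."""
    if not (len(anchor) == len(scale) == len(bias)):
        raise ValueError("anchor, scale, and bias must have the same length")
    return [a * s + b for a, s, b in zip(anchor, scale, bias)]


# Toy data (hypothetical): one HMM, two Gaussian components, 2-dim features.
clean_sv = build_supervector([[[0.0, 1.0], [2.0, 3.0]]])
noisy_sv = build_supervector([[[1.0, 1.0], [2.0, 5.0]]])

# Two ways to obtain a super-vector for a new environment.
interpolated = interpolate([clean_sv, noisy_sv], [0.5, 0.5])
transformed = transform(clean_sv, [1.0, 1.0, 1.0, 1.0], [0.5, 0.0, 0.0, 1.0])
```

In this toy case both routes land on the same new super-vector, which illustrates that the two approaches are alternative ways of reaching a point in the same vector space.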
Bio:
Yu Tsao received the B.S. and M.S. degrees in electrical engineering from National Taiwan University (NTU), Taipei, Taiwan, R.O.C., in 1999 and 2001, respectively. He is currently pursuing the Ph.D. degree at the Center for Signal and Image Processing (CSIP), Georgia Institute of Technology. His current research focuses primarily on detection-based speech recognition.