Date: September 23, 2005
Time: 3:00 p.m.
Location: GCATT Room 325
Speaker(s): Brett Matthews
Title: Synthesizing Breathiness in Speech with Sinusoidal Modeling
Abstract:
Voice conversion, or morphing, is the classic academic problem of transforming speech uttered by one speaker (the "Source Speaker") so that it sounds as if it were spoken by another (the "Target Speaker"). Approaches to voice morphing typically conclude that pitch and formant transformation alone are not enough; prosody, accent, speaking rate and other perceptual and suprasegmental characteristics must also be taken into account.

In this talk, we present recent work on synthesizing a breathy quality in speech, especially in the context of improving the quality of voice morphing. We use sinusoidal modeling, along with classic techniques from analog communication systems (variants of AM and PM), to de-voice regions of the spectrum and synthesize breathiness in speech. We also discuss in detail a new Java-based sinusoidal modeling re-synthesis framework developed over the summer.

This work, performed in conjunction with Ellen Eide and Raimo Bakis, was done during an internship in the Text-to-Speech Synthesis Group at the IBM TJ Watson Research Center in Yorktown Heights, NY.

Bio:
Brett Matthews was born in Brooklyn, NY, in 1978. He received the B.S. in Computer & Systems Engineering from Rensselaer Polytechnic Institute in 2001 and the M.S. in Electrical & Computer Engineering from Georgia Tech in 2003.

He is currently pursuing the Ph.D. in Electrical & Computer Engineering at Georgia Tech with a focus on speech & signal processing. His research interests include Text-to-Speech Synthesis, Voice Conversion, and Acoustic/Phonetic Automatic Speech Recognition.