Date: April 1, 2005
Time: 3:00 pm
Location: GCATT Room 325
Speaker(s): Dr. Juergen Schroeter, AT&T Labs - Research
Title: Some Scalability Issues in Speech Technologies
Abstract:
Most researchers are pleased when their novel algorithms outperform previously available technology. For practical use, however, successful application of our research also needs to scale, that is, be (almost) fully automated in day-to-day operations for existing applications while also being able to seamlessly accommodate more and bigger applications. This is particularly important for speech technology, where too few speech experts are available to do all the handiwork that still needs to be done for every new application.
One example where thinking of scaling issues early on helped us is Text-to-Speech (TTS). TTS is the technology that allows machines to talk to humans, delivering information through synthetic speech. Beyond improvements to the purely algorithmic aspects of the technology, customer demand for application-specific synthetic voices drove us to a more than 100-fold speed-up in creating top-quality TTS voices.
Another example is speech data mining aimed at monitoring and evaluating existing human-machine dialog systems. The traditional monitoring approach involves speech technology experts listening to a sample of human-machine dialogs. The better approach, however, is to mine the detailed application logs for all the dialogs, thus providing a complete, unbiased, quantitative analysis of the human-machine interactions. Solving scalability issues led to advanced tools for visualizing, evaluating, and designing complex dialog systems across a wide range of applications.
Bio:
Dipl.-Ing. (EE), 1976, and Dr.-Ing. (EE), 1983, Ruhr-Universitaet Bochum, W. Germany; AT&T Bell Laboratories, 1986-1995, AT&T Labs - Research, 1996-present.
From 1976 to 1985, Dr. Schroeter was with the Institute for Communication Acoustics, Ruhr-University Bochum, Germany, where he did research in the areas of hearing and acoustic signal processing. At AT&T Bell Laboratories, he has been working on speech coding and synthesis methods employing models of the vocal tract and vocal cords. As a Director in AT&T Labs - Research, he is leading efforts in Speech Algorithms and Engines Research. In 2001, his team created AT&T Natural Voices' award-winning text-to-speech synthesis system.
Dr. Schroeter is a Fellow of IEEE and a Fellow of the Acoustical Society of America. In 2001, he received the AT&T Science and Technology Medal. He served as an Associate Editor for the IEEE Transactions on Speech and Audio Processing and for the Journal of the Acoustical Society of America.