Spoken Language Systems
MIT Computer Science and Artificial Intelligence Laboratory


Speech Synthesis

ENVOICE is a concatenative speech synthesis system developed by the SLS group. It creates synthetic speech by concatenating segments drawn from a pre-recorded speech corpus. Segments can be concatenated at the phrase, word, or sub-word level. ENVOICE chooses which segments to concatenate by searching the corpus for those that best match a set of concatenation constraints. It uses finite-state transducer (FST) technology to deliver synthetic speech efficiently on demand.
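
The search described above can be illustrated with a small sketch. This is not ENVOICE's implementation: it uses plain dynamic programming (a Viterbi-style search) rather than FSTs, works on word-level units only, and assumes a deliberately simple constraint in which joining two segments costs nothing when they were adjacent in the original recording and costs 1.0 otherwise. The names `Segment`, `build_corpus`, `concat_cost`, and `select_units`, along with the toy corpus, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    label: str  # the word this recorded segment realizes
    utt: int    # index of the source utterance in the corpus (hypothetical field)
    pos: int    # position of the word within that utterance (hypothetical field)


def build_corpus(utterances):
    """Turn a list of transcribed utterances into word-level segments."""
    return [
        Segment(word, u, p)
        for u, text in enumerate(utterances)
        for p, word in enumerate(text.split())
    ]


def concat_cost(prev, cur):
    """Assumed constraint: a join is free if the two segments were
    contiguous in the same recording, and costs 1.0 otherwise."""
    contiguous = prev.utt == cur.utt and cur.pos == prev.pos + 1
    return 0.0 if contiguous else 1.0


def select_units(targets, corpus):
    """Viterbi-style search for the cheapest segment sequence spelling `targets`."""
    candidates = [[s for s in corpus if s.label == t] for t in targets]
    for target, cands in zip(targets, candidates):
        if not cands:
            raise ValueError(f"no segment in corpus for {target!r}")

    # best[i] holds (total cost, path) for the cheapest path ending in candidate i.
    best = [(0.0, [c]) for c in candidates[0]]
    for cands in candidates[1:]:
        new_best = []
        for cur in cands:
            cost, path = min(
                ((pc + concat_cost(pp[-1], cur), pp) for pc, pp in best),
                key=lambda item: item[0],
            )
            new_best.append((cost, path + [cur]))
        best = new_best
    return min(best, key=lambda item: item[0])


corpus = build_corpus([
    "flights from boston",
    "to boston today",
    "flights to denver",
])
total_cost, path = select_units(["flights", "to", "boston"], corpus)
print(total_cost, [(s.label, s.utt, s.pos) for s in path])
```

In this toy corpus no single utterance contains the whole target phrase, so every candidate sequence needs at least one non-contiguous join. The search prefers sequences that reuse the longest contiguous stretches of recorded speech, which is the intuition behind variable-length units.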


J. Yi, "Natural-Sounding Speech Synthesis Using Variable-Length Units," M.Eng. thesis, MIT Department of Electrical Engineering and Computer Science, May 1998. (PDF)

J. Yi and J. Glass, "Natural-Sounding Speech Synthesis Using Variable-Length Units," Proc. ICSLP, Sydney, Australia, November 1998. (PDF)

J. Yi, J. Glass and L. Hetherington, "A Flexible, Scalable Finite-State Transducer Architecture for Corpus-Based Concatenative Speech Synthesis," Proc. ICSLP, Beijing, China, October 2000. (PDF)

