SLS
Spoken Language Systems
MIT Computer Science and Artificial Intelligence Laboratory

SLS RESEARCH

Spoken Lecture Processing

In the past decade, we have seen a dramatic increase in the availability of online academic lecture material: low-cost media and fast networks allow new and exciting ways of disseminating knowledge in formats ranging from audio recordings to streaming video. These educational resources can potentially change the way people learn: students with disabilities can enhance their educational experience, professionals can keep up with recent advancements in their field, and people of all ages can satisfy their thirst for knowledge. It is conspicuous, however, that in contrast to many other communicative activities, lecture processing has until now benefited relatively little from the development of human language technology.

Recorded lectures could be more widely and effectively disseminated if the material could be automatically indexed, allowing students to access selected portions via web browsers and text-based queries (e.g., "tell me about A* search"). However, existing technology is severely limited when it comes to processing lectures. Automatic speech recognition of lecture material often suffers from high word error rates due to specialized technical vocabulary and the lack of in-domain spoken data for training. Although speech information retrieval algorithms can operate acceptably in the presence of recognition errors, most support only keyword searches. Accurately capturing the structural information required for concept-based retrieval is beyond the reach of existing techniques for speech analysis. Thus, the goal of this research is to enable fast, accurate, and easy access to lecture content by developing speech technology for spoken lecture transcription, tagging, and retrieval, and ultimately for automatic structure induction and summarization.
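For readers unfamiliar with the word error rate mentioned above, the following is a minimal sketch of how it is conventionally computed: the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and the recognizer output, divided by the number of reference words. The function name and the example sentences are illustrative assumptions, not part of the project's software.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / number of reference words.

    Illustrative helper (hypothetical name); tokenization is plain
    lowercased whitespace splitting, which is a simplifying assumption.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] is the edit distance between the reference words processed
    # so far (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Made-up example: two of six reference words are misrecognized.
wer = word_error_rate("tell me about a star search",
                      "tell me about a store church")
print(f"WER = {wer:.2f}")
```

Because insertions count as errors, this measure can exceed 1.0 when the recognizer outputs many more words than the reference contains.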

You can watch the video below to get an overview of the project, or listen to a podcast of a Spark episode that aired on CBC radio.

Further Reading

J. Glass, T. Hazen, S. Cyphers, I. Malioutov, D. Huynh, and R. Barzilay, "Recent Progress in the MIT Spoken Lecture Processing Project," in Proc. Interspeech, Antwerp, 2007.

J. Glass, T. Hazen, L. Hetherington, and C. Wang, "Analysis and Processing of Lecture Audio Data: Preliminary Investigations," in Proc. HLT-NAACL 2004 Workshop on Interdisciplinary Approaches to Speech Indexing and Retrieval, pp. 9-12, Boston, Massachusetts, May 2004.



32 Vassar Street
Cambridge, MA 02139 USA
(+1) 617.253.3049
 


©2020, Spoken Language Systems Group. All rights reserved.
