SLS RESEARCH
Spoken Lecture Processing
In the past decade, we have seen a dramatic increase in the
availability of on-line academic lecture material: low-cost media and
fast networks allow new and exciting ways of disseminating knowledge
in formats ranging from audio recordings to streaming video. These
educational resources can potentially change the way people learn:
students with disabilities can enhance their educational
experience, professionals can keep up with recent advancements in
their field, and people of all ages can satisfy their thirst for
knowledge. It is striking, however, that, in contrast to many other
communicative activities, lecture processing has so far benefited
relatively little from advances in human language technology.
Recorded lectures could be more widely and effectively disseminated if
material could be automatically indexed to allow students to access
selected portions of the material via web browsers and text-based
queries (e.g., "tell me about A* search"). However, existing
technology is severely limited when it comes to processing lectures.
Automatic speech recognition of lecture material often suffers from
high word error rates because of specialized technical vocabulary and
the lack of in-domain spoken data for training. Although speech
information retrieval algorithms can operate acceptably despite
recognition errors, most support only keyword searches. The
ability to accurately capture structural information required for
concept-based retrieval is beyond the reach of existing techniques for
speech analysis. Thus, the goal of this research is to enable fast,
accurate and easy access to lecture content by developing speech
technology for spoken lecture transcription, tagging and retrieval,
and ultimately automatic structure induction and summarization.
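For readers unfamiliar with the word error rate mentioned above, the following is a minimal sketch of how it is commonly computed: the word-level edit distance (substitutions, insertions, and deletions) between a reference transcript and the recognizer output, divided by the number of reference words. The function name and the example sentences are illustrative assumptions and are not taken from the project's software.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Illustrative sketch only: text is lowercased and split on whitespace,
    which is a simplifying assumption about transcript normalization.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one misrecognized technical term out of six words.
wer = word_error_rate("tell me about a star search",
                      "tell me about a store search")
print(f"WER = {wer:.3f}")
```

Because the count is normalized by the reference length, a single misrecognized technical term in a short query already produces a noticeable error rate, which is one reason specialized vocabulary matters so much for lecture material.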
You can watch the video below to get an overview of the project, or
listen to a podcast of a Spark episode that aired on CBC Radio.
Further Reading
J. Glass, T. Hazen, S. Cyphers, I. Malioutov, D. Huynh, and
R. Barzilay. "Recent Progress in the MIT Spoken Lecture Processing
Project," Proc. Interspeech, Antwerp, 2007. (PDF)
J. Glass, T. Hazen, L. Hetherington, and C. Wang. "Analysis and
Processing of Lecture Audio Data: Preliminary Investigations," Proc.
HLT-NAACL 2004 Workshop on Interdisciplinary Approaches to Speech
Indexing and Retrieval, pp. 9-12, Boston, Massachusetts, May 2004.
(PDF)