SLS
Spoken Language Systems
MIT Computer Science and Artificial Intelligence Laboratory

SLS PUBLICATIONS

Papers (1998 - present)

Many of our papers are available below in Adobe Acrobat (PDF) format and possibly gzip'd PostScript format.

2014

I. Saleh, S. Joty, L. Màrquez, A. Moschitti, P. Nakov, S. Cyphers, and J. Glass, "A Study of Using Syntactic and Semantic Structures for Concept Segmentation and Labeling," Proc. COLING, pp. 193-202, Dublin, Ireland, August 2014. (PDF)

T. Al Hanai and J. Glass, "Lexical Modeling for Arabic ASR: A Systematic Approach," Proc. Interspeech, pp. 2605-2609, Singapore, September 2014. (PDF)

D. Harwath and J. Glass, "Speech Recognition without a Lexicon - Bridging the Gap between Graphemic and Phonetic Systems," Proc. Interspeech, pp. 2655-2659, Singapore, September 2014. (PDF)

D. Harwath, A. Gruenstein, and I. McGraw, "Choosing Useful Word Alternates for Automatic Speech Recognition Correction Interfaces," Proc. Interspeech, pp. 949-953, Singapore, September 2014. (PDF)

A. Lee and J. Glass, "Context-dependent Pronunciation Error Pattern Discovery with Limited Annotations," Proc. Interspeech, pp. 2877-2881, Singapore, September 2014. (PDF)

H. Lee, Y. Zhang, E. Chuangsuwanich, and J. Glass, "Graph-based Re-ranking using Acoustic Feature Similarity between Search Results for Spoken Term Detection on Low-resource Languages," Proc. Interspeech, pp. 2479-2483, Singapore, September 2014. (PDF)

S. Shum, N. Dehak, and J. Glass, "Limited Labels for Unlimited Data: Active Learning for Speaker Recognition," Proc. Interspeech, pp. 383-387, Singapore, September 2014. (PDF)

Y. Zhang, E. Chuangsuwanich, and J. Glass, "Language ID-based Training of Multilingual Stacked Bottleneck Features," Proc. Interspeech, pp. 1-5, Singapore, September 2014. (PDF)

P. Cardinal, A. Ali, N. Dehak, Y. Zhang, T. Al Hanai, Y. Zhang, J. Glass, and S. Vogel, "Recent Advances in ASR Applied to an Arabic Transcription System for Al-Jazeera," Proc. Interspeech, pp. 2088-2092, Singapore, September 2014. (PDF)

B. Lake, C. Lee, J. Glass, and J. Tenenbaum, "One-Shot Learning of Generative Speech Concepts," Proc. CogSci, pp. 803-808, Quebec City, July 2014. (PDF)

M. H. Bahari, N. Dehak, H. Van hamme, L. Burget, A. M. Ali, and J. Glass, "Non-Negative Factor Analysis of Gaussian Mixture Model Weight Adaptation for Language and Dialect Recognition," IEEE/ACM Transactions on Audio, Speech, and Language Processing, July 2014, Vol. 22, No. 7, pp. 1117-1129. (PDF)

S. Shum, D. A. Reynolds, D. Garcia-Romero, and A. McCree, "Unsupervised Clustering Approaches for Domain Adaptation in Speaker Recognition Systems," Proc. Odyssey, pp. 265-272, Joensuu, Finland, June 2014. (PDF) (best student paper award)

N. Dehak, O. Plchot, M. H. Bahari, L. Burget, H. Van hamme, and R. Dehak, "GMM Weights Adaptation Based on Subspace Approaches for Speaker Verification," Proc. Odyssey, pp. 48-53, Joensuu, Finland, June 2014. (PDF)

D. Garcia-Romero, A. McCree, S. Shum, N. Brümmer, and C. Vaquero, "Unsupervised Domain Adaptation for I-Vector Speaker Recognition," Proc. Odyssey, pp. 260-264, Joensuu, Finland, June 2014. (PDF)

X. Feng, Y. Zhang, and J. Glass, "Speech Feature Denoising and Dereverberation via Deep Autoencoders for Noisy Reverberant Speech Recognition," Proc. ICASSP, pp. 1778-1782, Florence, Italy, May 2014. (PDF)

Y. Zhang, E. Chuangsuwanich, and J. Glass, "Extracting Deep Neural Network Bottleneck Features Using Low-Rank Matrix Factorization," Proc. ICASSP, pp. 185-189, Florence, Italy, May 2014. (PDF)

X. Feng, K. Kumatani, and J. McDonough, "The CMU-MIT Reverb Challenge 2014 System: Description and Results," Proc. REVERB Workshop, pp. 1-7, Florence, Italy, May 2014. (PDF)

C. Cai, P. Guo, J. Glass, and R. Miller, "Wait-Learning: Leveraging Conversational Dead Time for Second Language Education," Proc. CHI, Toronto, Canada, April 2014. (PDF)

M. Price, J. Glass, and A. Chandrakasan, "A 6mW 5K-Word Real-Time Speech Recognizer Using WFST Models," Proc. ISSCC, pp. 454-455, San Francisco, California, USA, February 2014. (PDF)

2013

J. Liu, P. Pasupat, Y. Wang, S. Cyphers, and J. Glass, "Query Understanding Enhanced by Hierarchical Parsing Structures," Proc. ASRU, pp. 72-77, Olomouc, Czech Republic, December 2013. (PDF)

C. Lee, Y. Zhang, and J. Glass, "Joint Learning of Phonetic Units and Word Pronunciations for ASR," Proc. EMNLP, pp. 182-192, Seattle, Washington, USA, October 2013. (PDF)

C. Cai, R. Miller, and S. Seneff, "Enhancing Speech Recognition in Fast-Paced Educational Games using Contextual Cues," Proc. SLaTE, pp. 54-59, Grenoble, France, August 2013. (PDF)

A. Lee and J. Glass, "Pronunciation Assessment via a Comparison-based System," Proc. SLaTE, pp. 122-126, Grenoble, France, August 2013. (PDF)

M. Senoussaoui, P. Kenny, P. Dumouchel, and N. Dehak, "New Cosine Similarity Scorings to Implement Gender-independent Speaker Verification," Proc. Interspeech, pp. 2773-2777, Lyon, France, August 2013. (PDF)

X. Fang, N. Dehak, and J. Glass, "Bayesian Distance Metric Learning on i-vector for Speaker Verification," Proc. Interspeech, pp. 2514-2518, Lyon, France, August 2013. (PDF)

W. Li, J. Glass, N. Roy, and S. Teller, "Probabilistic Dialogue Modeling for Speech-Enabled Assistive Technology," Proc. SLPAT, pp. 67-72, Grenoble, France, August 2013. (PDF)

S. Shum, N. Dehak, R. Dehak, and J. Glass, "Unsupervised Methods for Speaker Diarization: An Integrated and Iterative Approach," IEEE Transactions on Audio, Speech, and Language Processing, October 2013, Vol. 21, No. 10, pp. 2015-2028. (PDF)

E. Hill, D. Han, P. Dumouchel, N. Dehak, T. Quatieri, C. Moehs, M. Oscar-Berman, J. Giordano, T. Simpatico, and K. Blum, "Long Term Suboxone™ Emotional Reactivity As Measured by Automatic Detection in Speech," PLOS ONE, July 2013, Volume 8, Issue 7, pp. 1-14. (PDF)

M. H. Bahari, N. Dehak, and H. Van hamme, "Gaussian Mixture Model Weight Supervector Decomposition and Adaptation," Technical Report, June 12, 2013. (PDF)

A. Jansen, E. Dupoux, S. Goldwater, M. Johnson, S. Khudanpur, K. Church, N. Feldman, H. Hermansky, F. Metze, R. Rose, M. Seltzer, P. Clark, I. McGraw, B. Varadarajan, E. Bennett, B. Borschinger, J. Chiu, E. Dunbar, A. Fourtassi, D. Harwath, C. Lee, K. Levin, A. Norouzian, V. Peddinti, R. Richardson, T. Schatz, and S. Thomas, "A Summary of the 2012 JHU CLSP Workshop on Zero Resource Speech Technologies and Models of Early Language Acquisition," Proc. ICASSP, pp. 8111-8115, Vancouver, Canada, May 2013. (PDF)

O. Plchot, S. Matsoukas, P. Matějka, N. Dehak, J. Ma, S. Cumani, O. Glembek, H. Hermansky, S. H. Mallidi, N. Mesgarani, R. Schwartz, M. Soufifar, Z. H. Tan, S. Thomas, B. Zhang, and X. Zhou, "Developing a Speaker Identification System for the DARPA RATS Project," Proc. ICASSP, pp. 6768-6772, Vancouver, Canada, May 2013. (PDF)

D. Harwath, T. J. Hazen, and J. Glass, "Zero Resource Spoken Audio Corpus Analysis," Proc. ICASSP, pp. 8555-8559, Vancouver, Canada, May 2013. (PDF)

A. Lee, Y. Zhang, and J. Glass, "Mispronunciation Detection via Dynamic Time Warping on Deep Belief Network-Based Posteriorgrams," Proc. ICASSP, pp. 8227-8231, Vancouver, Canada, May 2013. (PDF) (student paper award)

J. Liu, P. Pasupat, S. Cyphers, and J. Glass, "Asgard: A Portable Architecture for Multilingual Dialogue Systems," Proc. ICASSP, pp. 8386-8390, Vancouver, Canada, May 2013. (PDF)

S. Shum, W. Campbell, and D. Reynolds, "Large-Scale Community Detection on Speaker Content Graphs," Proc. ICASSP, pp. 7716-7720, Vancouver, Canada, May 2013. (PDF)

A. Samsel and S. Seneff, "Glyphosate's Suppression of Cytochrome P450 Enzymes and Amino Acid Biosynthesis by the Gut Microbiome: Pathways to Modern Diseases," Entropy 2013, 15, 1416-1463; doi:10.3390/e15041416. (PDF)

I. McGraw, I. Badr, and J. Glass, "Learning Lexicons From Speech Using a Pronunciation Mixture Model," IEEE Transactions on Audio, Speech, and Language Processing, February 2013, Volume 21, Issue 2, pp. 357-366. (PDF)

2012

J. Liu, S. Seneff, and V. Zue, "Harvesting and Summarizing User-Generated Content for Advanced Speech-Based HCI," IEEE Journal of Selected Topics in Signal Processing, Vol. 6, No. 8, pp. 982-992, December 2012. (PDF)

S. Hartzell and S. Seneff, "Impaired Sulfate Metabolism and Epigenetics: Is There a Link in Autism?," Entropy 2012, 14, 1953-1977. (PDF)

S. Seneff, J. Liu, and R. Davidson, "Empirical Data Confirm Autism Symptoms Related to Aluminum and Acetaminophen Exposure," Entropy 2012, 14, 2227-2253; doi:10.3390/e14112227. (PDF)

S. Seneff, R. M. Davidson, and J. Liu, "Is Cholesterol Sulfate Deficiency a Common Factor in Preeclampsia, Autism, and Pernicious Anemia?," Entropy 2012, 14, 2265-2290; doi:10.3390/e14112265. (PDF)

S. Seneff, A. Lauritzen, R. Davidson, and L. Lentz-Marino, "Is Endothelial Nitric Oxide Synthase a Moonlighting Protein Whose Day Job is Cholesterol Sulfate Synthesis? Implications for Cholesterol Transport, Diabetes and Cardiovascular Disease," Entropy 2012, 14, 2492-2530; doi:10.3390/e14122492. (PDF)

A. Lee and J. Glass, "A Comparison-based Approach to Mispronunciation Detection," Proc. Spoken Language Technologies Workshop, pp. 382-387, Miami, Florida, December 2012. (PDF)

A. Lee and J. Glass, "Sentence Detection Using Multiple Annotations," Proc. Interspeech, Portland, Oregon, September 2012. (PDF)

J. Liu, S. Cyphers, P. Pasupat, I. McGraw, and J. Glass, "A Conversational Movie Search System Based on Conditional Random Fields," Proc. Interspeech, Portland, Oregon, September 2012. (PDF)

I. McGraw, S. Cyphers, P. Pasupat, J. Liu, and J. Glass, "Automating Crowd-supervised Learning for Spoken Language Systems," Proc. Interspeech, Portland, Oregon, September 2012. (PDF)

I. McGraw and A. Gruenstein, "Estimating Word-Stability During Incremental Speech Recognition," Proc. Interspeech, Portland, Oregon, September 2012. (PDF)

S. Shum, N. Dehak, and J. Glass, "On the Use of Spectral and Iterative Methods for Speaker Diarization," Proc. Interspeech, Portland, Oregon, September 2012. (PDF)

P. Matějka, O. Plchot, M. Soufifar, O. Glembek, L. D'Haro, K. Veselý, F. Grézl, J. Ma, S. Matsoukas, and N. Dehak, "Patrol Team Language Identification System for DARPA RATS P1 Evaluation," Proc. Interspeech, Portland, Oregon, September 2012. (PDF)

J. Glass, "Towards Unsupervised Speech Processing," Keynote, Proc. ISSPA, Montreal, July 2012. (PDF)

C. Lee and J. Glass, "A Nonparametric Bayesian Approach to Acoustic Model Discovery," Proc. ACL, pp. 40-49, Jeju, Republic of Korea, July 2012. (PDF)

M. Senoussaoui, N. Dehak, P. Kenny, R. Dehak, and P. Dumouchel, "First Attempt of Boltzmann Machines for Speaker Verification," Proc. Odyssey, pp. 117-121, Singapore, June 2012. (PDF)

E. Singer, P. Torres-Carrasquillo, D. Reynolds, A. McCree, F. Richardson, N. Dehak, and D. Sturim, "The MITLL NIST LRE 2011 Language Recognition System," Proc. Odyssey, pp. 209-215, Singapore, June 2012. (PDF)

H. Chang and J. Glass, "Evaluation of Multi-level Context-Dependent Acoustic Model for Large Vocabulary Speaker Adaptation Tasks," Proc. ICASSP, pp. 4313-4316, Kyoto, Japan, March 2012. (PDF)

E. Chuangsuwanich, S. Watanabe, T. Hori, T. Iwata, and J. Glass, "Handling Uncertain Observations in Unsupervised Topic-Mixture Language Model Adaptation," Proc. ICASSP, pp. 5033-5036, Kyoto, Japan, March 2012. (PDF)

D. Harwath and T. J. Hazen, "Topic Identification Based Extrinsic Evaluation of Summarization Techniques Applied to Conversational Speech," Proc. ICASSP, pp. 5073-5076, Kyoto, Japan, March 2012. (PDF)

Y. Xu and S. Seneff, "Improving Nonnative Speech Understanding Using Context and N-Best Meaning Fusion," Proc. ICASSP, pp. 4977-4980, Kyoto, Japan, March 2012. (PDF)

Y. Zhang, K. Adl, and J. Glass, "Fast Spoken Query Detection Using Lower-Bound Dynamic Time Warping on Graphical Processing Units," Proc. ICASSP, pp. 5173-5176, Kyoto, Japan, March 2012. (PDF)

Y. Zhang, R. Salakhutdinov, H. Chang, and J. Glass, "Resource Configurable Spoken Query Detection Using Deep Boltzmann Machines," Proc. ICASSP, pp. 5161-5164, Kyoto, Japan, March 2012. (PDF)

2011

H. Chang and J. Glass, "Multi-level Context-dependent Acoustic Modeling for Automatic Speech Recognition," Proc. ASRU, Waikoloa, Hawaii, December 2011. (PDF)

J. Liu, "A Dialogue System for Accessing Drug Reviews," Proc. ASRU, Waikoloa, Hawaii, December 2011. (PDF)

T. Mertens and S. Seneff, "Subword-based Automatic Lexicon Learning for ASR," Proc. ASRU, Waikoloa, Hawaii, December 2011. (PDF)

J. Liu, A. Li, and S. Seneff, "Automatic Drug Side Effect Discovery from Online Patient-Submitted Reviews: Focus on Statin Drugs," Proc. IMMM, Barcelona, Spain, October 2011. (PDF)

I. Badr, I. McGraw, and J. Glass, "Pronunciation Learning from Continuous Speech," Proc. Interspeech, pp. 549-552, Florence, Italy, August 2011. (PDF)

E. Chuangsuwanich and J. Glass, "Robust Voice Activity Detector for Real World Applications Using Harmonicity and Modulation Frequency," Proc. Interspeech, pp. 2645-2648, Florence, Italy, August 2011. (PDF)

N. Dehak, P. Torres-Carrasquillo, D. Reynolds, and R. Dehak, "Language Recognition via Ivectors and Dimensionality Reduction," Proc. Interspeech, pp. 857-860, Florence, Italy, August 2011. (PDF)

C. Lee and J. Glass, "A Transcription Task for Crowdsourcing with Automatic Quality Control," Proc. Interspeech, pp. 3041-3044, Florence, Italy, August 2011. (PDF)

C. Lee, J. Glass, and O. Ghitza, "An Efferent-Inspired Auditory Model Front-End for Speech Recognition," Proc. Interspeech, pp. 49-52, Florence, Italy, August 2011. (PDF)

I. McGraw, J. Glass, and S. Seneff, "Growing a Spoken Language Interface on Amazon Mechanical Turk," Proc. Interspeech, pp. 3057-3060, Florence, Italy, August 2011. (PDF)

S. Shum, N. Dehak, E. Chuangsuwanich, D. Reynolds, and J. Glass, "Exploiting Intra-Conversation Variability for Speaker Diarization," Proc. Interspeech, pp. 945-948, Florence, Italy, August 2011. (PDF)

Y. Zhang and J. Glass, "A Piecewise Aggregate Approximation Lower-Bound Estimate for Posteriorgram-based Dynamic Time Warping," Proc. Interspeech, pp. 1909-1912, Florence, Italy, August 2011. (PDF)

Y. Xu and S. Seneff, "A Generic Framework for Building Dialogue Games for Language Learning: Application in the Flight Domain," Proc. SLaTE, Venice, Italy, August 2011. (PDF)

Z. Karam, W. Campbell, and N. Dehak, "Graph Relational Features for Speaker Recognition and Mining," Proc. Statistical Signal Processing Workshop (SSP 2011), pp. 525-528, Nice, France, June 2011. (PDF)

H. Chang, Y. Sung, B. Strope, and F. Beaufays, "Recognizing English Queries in Mandarin Voice Search," Proc. ICASSP, pp. 5016-5019, Prague, Czech Republic, May 2011. (PDF)

N. Dehak, P. Kenny, R. Dehak, P. Dumouchel, and P. Ouellet, "Front-End Factor Analysis for Speaker Verification," IEEE Transactions on Audio, Speech, and Language Processing, Vol. 19, No. 4, May 2011, pp. 788-798. (PDF) (Selected for the IEEE Signal Processing Society Young Author Best Paper Award.)

N. Dehak, Z. Karam, D. Reynolds, R. Dehak, W. Campbell, and J. Glass, "A Channel-Blind System for Speaker Verification," Proc. ICASSP, pp. 4536-4539, Prague, Czech Republic, May 2011. (PDF)

Z. Karam, W. Campbell, and N. Dehak, "Towards Reduced False-Alarms Using Cohorts," Proc. ICASSP, pp. 4512-4515, Prague, Czech Republic, May 2011. (PDF)

J. Liu, X. Li, A. Acero, and Y. Wang, "Lexicon Modeling for Query Understanding," Proc. ICASSP, pp. 5604-5607, Prague, Czech Republic, May 2011. (PDF)

D. Sturim, W. Campbell, N. Dehak, Z. Karam, A. McCree, D. Reynolds, F. Richardson, P. Torres-Carrasquillo, and S. Shum, "The MIT LL 2010 Speaker Recognition Evaluation System: Scalable Language-Independent Speaker Recognition," Proc. ICASSP, pp. 5272-5275, Prague, Czech Republic, May 2011. (PDF)

Y. Zhang and J. Glass, "An Inner-Product Lower-Bound Estimate for Dynamic Time Warping," Proc. ICASSP, pp. 5660-5663, Prague, Czech Republic, May 2011. (PDF)

Y. Zhang, L. Deng, X. He, and A. Acero, "A Novel Decision Function and the Associated Decision-Feedback Learning for Speech Translation," Proc. ICASSP, pp. 5608-5611, Prague, Czech Republic, May 2011. (PDF)

2010

Y. Xu, S. Seneff, A. Li, and J. Polifroni, "Semantic Understanding by Combining Extended CFG Parser with HMM Model," Proc. Spoken Language Technologies Workshop, Berkeley, California, USA, December 2010. (PDF)

S. Liu, S. Seneff, and J. Glass, "A Collective Data Generation Method for Speech Language Models," Proc. Spoken Language Technologies Workshop, Berkeley, California, USA, December 2010. (PDF)

E. Chuangsuwanich, S. Cyphers, J. Glass, and S. Teller, "Spoken Command of Large Mobile Robots in Outdoor Environments," Proc. Spoken Language Technologies Workshop, Berkeley, California, USA, December 2010. (PDF)

J. Polifroni, S. Seneff, S.R.K. Branavan, C. Wang, and R. Barzilay, "Good Grief, I Can Speak It! Preliminary Experiments in Audio Restaurant Reviews," Proc. Spoken Language Technologies Workshop, Berkeley, California, USA, December 2010. (PDF)

I. Badr, I. McGraw, and J. Glass, "Learning New Word Pronunciations from Spoken Examples," Proc. Interspeech, Chiba, Japan, September 2010. (PDF)

J. Liu, S. Seneff, and V. Zue, "Utilizing Review Summarization in a Spoken Recommendation System," Proc. SIGDIAL, Tokyo, Japan, September 2010. (PDF)

Y. Xu and S. Seneff, "Dialogue Management Based on Entities and Constraints," Proc. SIGDIAL, Tokyo, Japan, September 2010. (PDF)

M. Peabody and S. Seneff, "A Simple Feature Normalization Scheme for Non-native Vowel Assessment," Satellite Workshop on Second Language Studies: Acquisition, Learning, Education and Technology, Tokyo, Japan, September 2010. (PDF)

J. Polifroni, I. Kiss, and S. Seneff, "Speech for Content Creation," Proc. SiMPE, Lisbon, Portugal, September 2010. (PDF)

R. Zbib, S. Matsoukas, R. Schwartz, and J. Makhoul, "Decision Trees for Lexical Smoothing in Statistical Machine Translation," Proc. ACL Joint 5th Workshop on Statistical Machine Translation, Uppsala, Sweden, July 2010. (PDF)

N. Dehak, R. Dehak, J. Glass, D. Reynolds, and P. Kenny, "Cosine Similarity Scoring without Score Normalization Techniques," Proc. IEEE Odyssey Workshop, Brno, Czech Republic, June 2010. (PDF)

S. Shum, N. Dehak, R. Dehak, and J. Glass, "Unsupervised Speaker Adaptation Based on the Cosine Similarity for Text-Independent Speaker Verification," Proc. IEEE Odyssey Workshop, Brno, Czech Republic, June 2010. (PDF)

M. Senoussaoui, P. Kenny, N. Dehak, and P. Dumouchel, "An i-Vector Extractor Suitable for Speaker Recognition with Both Microphone and Telephone Speech," Proc. IEEE Odyssey Workshop, Brno, Czech Republic, June 2010. (PDF)

I. McGraw, C. Lee, L. Hetherington, S. Seneff, and J. Glass, "Collecting Voices from the Cloud," Proc. LREC, Malta, May 2010. (PDF)

S. Teller, M. Walker, M. Antone, A. Correa, R. Davis, L. Fletcher, E. Frazzoli, J. Glass, J. How, A. S. Huang, J. Jeon, S. Karaman, B. Luders, N. Roy, and T. Sainath, "A Voice-Commandable Robotic Forklift Working Alongside Humans in Minimally-Prepared Outdoor Environments," Proc. ICRA, Anchorage, Alaska, United States, May 2010. (PDF)

J. Liu, S. Seneff, and V. Zue, "Dialogue-Oriented Review Summary Generation for Spoken Dialogue Recommendation Systems," Proc. NAACL-HLT, Los Angeles, California, United States, March 2010. (PDF)

Y. Zhang and J. Glass, "Towards Multi-Speaker Unsupervised Speech Pattern Discovery," Proc. ICASSP, pp. 4366-4369, Dallas, Texas, United States, March 2010. (PDF)

J. Ming, T. J. Hazen, and J. R. Glass, "Combining Missing-Feature Theory, Speech Enhancement, and Speaker-Dependent/-Independent Modeling for Speech Separation," Computer Speech and Language 24, January 2010, pp. 67-76. (PDF)

2009

Y. Xu and S. Seneff, "Speech-Based Interactive Games for Language Learning: Reading, Translation, and Question-Answering," International Journal of Computational Linguistics and Chinese Language Processing, vol. 14, no. 2 (2009) (PDF)

T. N. Sainath, "Island-Driven Search Using Broad Phonetic Classes," Proc. ASRU, Merano, Italy, December 2009. (PDF)

Y. Zhang and J. Glass, "Unsupervised Spoken Keyword Spotting via Segmental DTW on Gaussian Posteriorgrams," Proc. ASRU, Merano, Italy, December 2009. (PDF)

K. Saenko, K. Livescu, J. Glass, and T. Darrell, "Multistream Articulatory Feature-Based Models for Visual Speech Recognition," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 31, no. 9, pp. 1700-1701, September 2009. (PDF)

I. McGraw, A. Gruenstein, and A. Sutherland, "A Self-Labeling Speech Corpus: Collecting Spoken Words with an Online Educational Game," Proc. Interspeech, Brighton, UK, September 2009. (PDF)

H. Chang and J. Glass, "A Back-off Discriminative Acoustic Model for Automatic Speech Recognition," Proc. Interspeech, Brighton, UK, September 2009. (PDF)

M. Peabody and S. Seneff, "Annotation and Features of Non-native Mandarin Tone Quality," Proc. Interspeech, Brighton, UK, September 2009. (PDF)

A. Gruenstein, I. McGraw, and A. Sutherland, "A Self-Transcribing Speech Corpus: Collecting Continuous Speech with an Online Educational Game," Proc. SIGSLaTE, Warwickshire, England, September 2009. (PDF)

B. Yoshimoto, I. McGraw, and S. Seneff, "Rainbow Rummy: A Web-based Game for Vocabulary Acquisition using Computer-directed Speech," Proc. SIGSLaTE, Warwickshire, England, September 2009. (PDF)

Y. Xu, A. Goldie, and S. Seneff, "Automatic Question Generation and Answer Judging: A Q&A Game for Language Learning," Proc. SIGSLaTE, Warwickshire, England, September 2009. (PDF)

J. Liu and S. Seneff, "Review Sentiment Scoring via a Parse-and-Paraphrase Paradigm," Proc. EMNLP, Singapore, August 2009. (PDF)

J. M. Baker, L. Deng, S. Khudanpur, C. Lee, J. Glass, N. Morgan, and D. O'Shaughnessy, "Updated MINDS Report on Speech Recognition and Understanding, Part 2," IEEE Signal Processing Magazine, pp. 79-86, July 2009. (PDF)

J. M. Baker, L. Deng, J. Glass, S. Khudanpur, C. Lee, N. Morgan, and D. O'Shaughnessy, "Research Developments and Directions in Speech Recognition and Understanding, Part 1," IEEE Signal Processing Magazine, pp. 75-80, May 2009. (PDF)

A. Gruenstein, J. Orszulak, S. Liu, S. Roberts, J. Zabel, B. Reimer, B. Mehler, S. Seneff, J. Glass, and J. Coughlin, "City Browser: Developing a Conversational Automotive HMI," Proc. CHI, pp. 4291-4296, Boston, April 2009. (PDF)

I. Badr, R. Zbib, and J. Glass, "Syntactic Phrase Reordering for English-to-Arabic Statistical Machine Translation," Proc. EACL, pp. 86-93, Athens, April 2009. (PDF)

H. Chang and J. Glass, "Discriminative Training of Hierarchical Acoustic Models for Large Vocabulary Continuous Speech Recognition," Proc. ICASSP, Taipei, Taiwan, April 2009. (PDF)

B. Hsu and J. Glass, "Language Model Parameter Estimation Using User Transcriptions," Proc. ICASSP, Taipei, Taiwan, April 2009. (PDF)

Y. Zhang and J. Glass, "Speech Rhythm Guided Syllable Nuclei Detection," Proc. ICASSP, Taipei, Taiwan, April 2009. (PDF)

K. Livescu, B. Zhu, and J. Glass, "On the Phonetic Information in Ultrasonic Microphone Signals," Proc. ICASSP, Taipei, Taiwan, April 2009. (PDF)

D. Kanevsky, T. N. Sainath, and B. Ramabhadran, "A Generalized Family of Parameter Estimation Techniques," Proc. ICASSP, Taipei, Taiwan, April 2009. (PDF)

2008

I. McGraw, B. Yoshimoto, and S. Seneff, "Speech-enabled Card Games for Incidental Vocabulary Acquisition in a Foreign Language," Speech Communication 2008. (PDF)

J. Liu, Y. Xu, S. Seneff, and V. Zue, "CityBrowser II: A Multimodal Restaurant Guide in Mandarin," Proc. ISCSLP, Kunming, China, December 2008. (PDF)

Y. Xu and S. Seneff, "Mandarin Learning Using Speech and Language Technologies: A Translation Game in the Travel Domain," Proc. ISCSLP, Kunming, China, December 2008. (PDF)

Y. Xu, J. Liu, and S. Seneff, "Mandarin Language Understanding in Dialogue Context," Proc. ISCSLP, Kunming, China, December 2008. (PDF)

Y. Xu and S. Seneff, "Two-Stage Translation: A Combined Linguistic and Statistical Machine Translation Framework," Proc. AMTA, Waikiki, Hawaii, USA, October 2008. (PDF)

A. Gruenstein, I. McGraw, and I. Badr, "The WAMI Toolkit for Developing, Deploying, and Evaluating Web-Accessible Multimodal Interfaces," Proc. ICMI, Chania, Crete, Greece, October 2008. (PDF)

B. Hsu and J. Glass, "N-gram Weighting: Reducing Training Data Mismatch in Cross-Domain Language Model Estimation," Proc. EMNLP, Honolulu, Hawaii, USA, October 2008. (PDF)

B. Hsu and J. Glass, "Iterative Language Model Estimation: Efficient Data Structure & Algorithms," Proc. Interspeech, Brisbane, Australia, September 2008. (PDF)

T. N. Sainath and V. Zue, "A Comparison of Broad Phonetic and Acoustic Units for Noise Robust Segment-Based Speech Recognition," Proc. Interspeech, Brisbane, Australia, September 2008. (PDF)

D. Kanevsky, T. N. Sainath, B. Ramabhadran, and D. Nahamoo, "Generalization of Extended Baum-Welch Parameter Estimation for Discriminative Training and Decoding," Proc. Interspeech, Brisbane, Australia, September 2008. (PDF)

I. McGraw and S. Seneff, "Speech-enabled Card Games for Language Learners," Proc. AAAI, Chicago, Illinois, USA, July 2008. (PDF)

A. Gruenstein, B. Hsu, J. Glass, S. Seneff, I. Hetherington, S. Cyphers, I. Badr, C. Wang, and S. Liu, "A Multimodal Home Entertainment Interface via a Mobile Device", Proc. of ACL Workshop on Mobile Language Processing, Columbus, Ohio, USA, June 2008. (PDF)

A. Gruenstein, "Response-Based Confidence Annotation for Spoken Dialogue Systems", Proc. of SIGdial Workshop on Discourse and Dialogue, Columbus, Ohio, USA, June 2008. (PDF)

J. Lee and S. Seneff, "Correcting Misuse of Verb Forms," Proc. ACL, Columbus, Ohio, USA, June 2008. (PDF)

I. Badr, R. Zbib, and J. Glass, "Segmentation for English-to-Arabic Statistical Machine Translation", Proc. ACL, Columbus, Ohio, USA, June 2008. (PDF)

T. N. Sainath, D. Kanevsky, and B. Ramabhadran, "Gradient Steepness Metrics using Extended Baum-Welch Transformations for Universal Pattern Recognition Tasks," Proc. ICASSP, Las Vegas, Nevada, USA, April 2008. (PDF)

G. Choueiter, M. Ohannessian, S. Seneff, and J. Glass, "A Turbo-Style Algorithm for Lexical Baseforms Estimation", Proc. ICASSP, Las Vegas, Nevada, USA, April 2008. (PDF)

G. Choueiter, G. Zweig, and P. Nguyen, "An Empirical Study of Automatic Accent Classification", Proc. ICASSP, Las Vegas, Nevada, USA, April 2008. (PDF)

J. Lee and O. Knutsson, "The Role of PP Attachment in Preposition Generation", Proc. CICLing, Haifa, Israel, February 2008. (PDF)

A. Park and J. Glass, "Unsupervised Pattern Discovery in Speech", IEEE Transactions on Audio, Speech, and Language Processing, Vol. 16, No. 1, January 2008. (PDF)

2007

K. Schutte and J. Glass, "Speech Recognition with Localized Time-Frequency Pattern Detectors," Proc. ASRU, Kyoto, Japan, December 2007. (PDF)

G. Choueiter, S. Seneff, and J. Glass, "Automatic Lexical Pronunciations Generation and Update", Proc. ASRU, Kyoto, Japan, December 2007. (PDF)

T. Sainath, D. Kanevsky, and B. Ramabhadran, "Broad Phonetic Class Recognition in a Hidden Markov Model Framework Using Extended Baum-Welch Transformations", Proc. ASRU, Kyoto, Japan, December 2007. (PDF)

B. Hsu, "Generalized Linear Interpolation of Language Models", Proc. ASRU, pp. 136-140, Kyoto, Japan, December 2007. (PDF)

H. Chang and J. Glass, "Hierarchical Large-Margin Gaussian Mixture Models for Phonetic Classification", Proc. ASRU, Kyoto, Japan, December 2007. (PDF)

I. McGraw and S. Seneff, "Immersive Second Language Acquisition in Narrow Domains: A Prototype ISLAND Dialogue System", Proc. of the Speech and Language Technology in Education (SLaTE) Workshop, Farmington, Pennsylvania, October 2007. (PDF)

C. Chao, S. Seneff, and C. Wang, "An Interactive Interpretation Game for Learning Chinese", Proc. of the Speech and Language Technology in Education (SLaTE) Workshop, Farmington, Pennsylvania, October 2007. (PDF)

S. Seneff, "Web-based Dialogue and Translation Games for Spoken Language Learning", Proc. of the Speech and Language Technology in Education (SLaTE) Workshop, Farmington, Pennsylvania, October 2007. (PDF)

A. Gruenstein and S. Seneff, "Releasing a Multimodal Dialogue System into the Wild: User Support Mechanisms", Proc. of the 8th SIGdial Workshop on Discourse and Dialogue, Antwerp, Belgium, pp. 111-119, September 2007. (PDF)

V. Zue, "On Organic Interfaces", Proc. Interspeech, Antwerp, Belgium, August 2007. (PDF)

G. Choueiter, S. Seneff, and J. Glass, "New Word Acquisition Using Subword Modeling", Proc. Interspeech, Antwerp, Belgium, August 2007. (PDF)

J. Frankel, M. Magimai-Doss, S. King, K. Livescu and O. Cetin, "Articulatory Feature Classifiers Trained on 2000 hours of Telephone Speech", Proc. Interspeech, Antwerp, Belgium, August 2007. (PDF)

J. Glass, T. J. Hazen, S. Cyphers, I. Malioutov, D. Huynh, and R. Barzilay, "Recent Progress in the MIT Spoken Lecture Processing Project", Proc. Interspeech, Antwerp, Belgium, August 2007. (PDF)

T. J. Hazen, B. Sherry, and M. Adler, "Speech-Based Annotation and Retrieval of Digital Photographs", Proc. Interspeech, Antwerp, Belgium, August 2007. (PDF)

T. J. Hazen and D. Schultz, "Multi-Modal User Authentication from Video for Mobile or Variable-Environment Applications," Proc. Interspeech, Antwerp, Belgium, August 2007. (PDF)

T. J. Hazen and E. McDermott, "Discriminative MCE-Based Speaker Adaptation of Acoustic Models for a Spoken Lecture Processing Task", Proc. Interspeech, Antwerp, Belgium, August 2007. (PDF)

I. Hetherington, "PocketSUMMIT: Small-Footprint Continuous Speech Recognition", Proc. Interspeech, Antwerp, Belgium, August 2007. (PDF)

J. Lee and S. Seneff, "Automatic Generation of Cloze Items for Prepositions", Proc. Interspeech, Antwerp, Belgium, August 2007. (PDF)

T. Sainath, V. Zue, and D. Kanevsky, "Audio Classification using the Extended Baum-Welch Transformations", Proc. Interspeech, Antwerp, Belgium, August 2007. (PDF)

N. Singh-Miller, M. Collins, and T. J. Hazen, "Dimensionality Reduction for Speech Recognition Using Neighborhood Components Analysis", Proc. Interspeech, Antwerp, Belgium, August 2007. (PDF)

H. Wu and S. Seneff, "Reducing Recognition Error Rate based on Context Relationships among Dialogue Turns", Proc. Interspeech, Antwerp, Belgium, August 2007. (PDF)

B. Zhu, T. J. Hazen, and J. Glass, "Multimodal Speech Recognition with Ultrasonic Sensors", Proc. Interspeech, Antwerp, Belgium, August 2007. (PDF)

M. Hasegawa-Johnson, K. Livescu, P. Lal, and K. Saenko, "Audiovisual Speech Recognition with Articulator Positions as Hidden Variables", Proc. International Congress of Phonetic Sciences, Saarbruecken, Germany, August 2007. (PDF)

C. Wang and S. Seneff, "A Spoken Translation Game for Second Language Learning", Proc. AIED, Marina del Rey, California, July 2007. (PDF)

I. Malioutov, A. Park, R. Barzilay, and J. Glass, "Making Sense of Sound: Unsupervised Topic Segmentation over Acoustic Input", Proc. ACL, Prague, Czech Republic, June 2007. (PDF)

J. Lee, "A Computational Model of Text Reuse in Ancient Literary Texts", Proc. ACL, Prague, Czech Republic, June 2007. (PDF)

G. Sun, X. Liu, G. Cong, M. Zhou, Z. Xiong, J. Lee, and C. Lin, "Detecting Erroneous Sentences using Automatically Mined Sequential Patterns", Proc. ACL, Prague, Czech Republic, June 2007. (PDF)

C. Wang, M. Collins, and P. Koehn, "Chinese Syntactic Reordering for Statistical Machine Translation", Proc. EMNLP, Prague, Czech Republic, June 2007. (PDF)

S. Seneff, M. Adler, J. Glass, B. Sherry, T. J. Hazen, C. Wang, and T. Wu, "Exploiting Context Information in Spoken Dialogue Interaction with Mobile Devices", Proc. International Workshop on Improved Mobile User Experience (IMUx), Toronto, Canada, May 2007. (PDF)

T. Sainath, D. Kanevsky, and G. Iyengar, "Unsupervised Audio Segmentation Using Extended Baum-Welch Transformations", Proc. ICASSP, Honolulu, Hawaii, April 2007. (PDF)

T. Hori, I. L. Hetherington, T. J. Hazen, and J. Glass, "Open-Vocabulary Spoken Utterance Retrieval Using Confusion Networks", Proc. ICASSP, Honolulu, Hawaii, April 2007. (PDF)

R. Rifkin, K. Schutte, M. Saad, J. Bouvrie, and J. Glass, "Noise Robust Phonetic Classification with Linear Regularized Least Squares and Second-Order Features", Proc. ICASSP, Honolulu, Hawaii, April 2007. (PDF)

K. Livescu, O. Cetin, M. Hasegawa-Johnson, S. King, C. Bartels, N. Borges, A. Kantor, P. Lal, L. Yung, A. Bezman, S. Dawson-Haggerty, B. Woods, J. Frankel, M. Magimai-Doss, K. Saenko, "Articulatory Feature-based Methods for Acoustic and Audio-visual Speech Recognition: Summary from the 2006 JHU Summer Workshop", Proc. ICASSP, Honolulu, Hawaii, April 2007. (PDF)

K. Livescu, A. Bezman, N. Borges, L. Yung, O. Cetin, J. Frankel, S. King, M. Magimai-Doss, X. Chi, and L. Lavoie, "Manual Transcription of Conversational Speech at the Articulatory Feature Level", Proc. ICASSP, Honolulu, Hawaii, April 2007. (PDF)

O. Cetin, A. Kantor, S. King, C. Bartels, M. Magimai-Doss, J. Frankel, and K. Livescu, "An Articulatory Feature-based Tandem Approach and Factored Observation Modeling", Proc. ICASSP, Honolulu, Hawaii, April 2007. (PDF)

C. Wang and S. Seneff, "Automatic Assessment of Student Translations for Foreign Language Tutoring", Proc. HLT-NAACL, Rochester, NY, April 2007. (PDF)

J. Lee, M. Zhou, and X. Liu, "Detection of Non-native Sentences using Machine-translated Training Data", Proc. HLT-NAACL (Short Papers), Rochester, NY, April 2007. (PDF)

S. Seneff, C. Wang, and C. Chao, "Spoken Dialogue Systems for Language Learning", Proc. HLT-NAACL, Rochester, NY, April 2007. (PDF)

2006

B. Hsu and J. Glass, "Spoken Correction for Chinese Text Entry," Proc. 5th International Symposium on Chinese Spoken Language Processing (ISCSLP), Kent Ridge, Singapore, December 2006. (PDF)

M. Peabody and S. Seneff, "Towards Automatic Tone Correction in Non-native Mandarin," Proc. 5th International Symposium on Chinese Spoken Language Processing (ISCSLP), Kent Ridge, Singapore, December 2006. (PDF)

A. Gruenstein and S. Seneff, "Context-Sensitive Language Modeling for Large Sets of Proper Nouns in Multimodal Dialogue Systems," Proc. IEEE/ACL 2006 Workshop on Spoken Language Technology, Palm Beach, Aruba, December 2006. (PDF)

A. Park and J. Glass, "A Novel DTW-based Distance Measure for Speaker Segmentation," Proc. IEEE/ACL 2006 Workshop on Spoken Language Technology, Palm Beach, Aruba, December 2006. (PDF)

K. Saenko and K. Livescu, "An Asynchronous DBN for Audio-Visual Speech Recognition," Proc. IEEE/ACL 2006 Workshop on Spoken Language Technology, Palm Beach, Aruba, December 2006. (PDF)

A. Gruenstein, S. Seneff, and C. Wang, "Scalable and Portable Web-Based Multimodal Dialogue Interaction with Geographical Database," Proc. Interspeech, Pittsburgh, Pennsylvania, September 2006. (PDF)

T. J. Hazen, "Automatic Alignment and Error Correction of Human Generated Transcripts for Long Speech Recordings," Proc. Interspeech, Pittsburgh, Pennsylvania, September 2006. (PDF)

J. Lee and S. Seneff, "Automatic Grammar Correction for Second-Language Learners," Proc. Interspeech, Pittsburgh, Pennsylvania, September 2006. (PDF)

J. Ming, T. J. Hazen, and J. Glass, "Combining Missing-Feature Theory, Speech Enhancement and Speaker-Dependent/-Independent Modeling for Speech Separation," Proc. Interspeech, Pittsburgh, Pennsylvania, September 2006. (PDF)

C. Wang and S. Seneff, "High-Quality Speech Translation in the Flight Domain," Proc. Interspeech, Pittsburgh, Pennsylvania, September 2006. (PDF)

Y. Wang, A. Acero, M. Mahajan, and J. Lee, "Combining Statistical and Knowledge-Based Spoken Language Understanding in Conditional Models," Proc. COLING/ACL, Sydney, Australia, July 2006. (PDF)

B. Hsu and J. Glass, "Style & Topic Language Model Adaptation Using HMM-LDA," Proc. EMNLP, Sydney, Australia, July 2006. (PDF)

E. Filisko and S. Seneff, "Learning Decision Models in Spoken Dialogue Systems via User Simulation," Proc. AAAI Workshop on Statistical and Empirical Approaches for Spoken Dialog Systems, Boston, Massachusetts, July 2006. (PDF)

R. Woo, A. Park, and T. J. Hazen, "The MIT Mobile Device Speaker Verification Corpus: Data collection and preliminary experiments," Proceedings of Odyssey 2006, The Speaker and Language Recognition Workshop, June 2006. (PDF)

G. Choueiter, D. Povey, S.F. Chen, and G. Zweig, "Morpheme-Based Language Modeling for Arabic LVCSR," Proc. ICASSP 2006, Toulouse, France, May 2006. (PDF)

I. L. Hetherington, H. Shu, and J. Glass, "Flexible Multi-Stream Framework for Speech Recognition Using Multi-Tape Finite-State Transducers," Proc. ICASSP 2006, Toulouse, France, May 2006. (PDF)

J. Ming, T. J. Hazen, and J. Glass, "Speaker Verification Over Handheld Devices with Realistic Noisy Speech Data," Proc. ICASSP 2006, Toulouse, France, May 2006. (PDF)

A. Park and J. Glass, "Unsupervised Word Acquisition from Speech Using Pattern Discovery," Proc. ICASSP 2006, Toulouse, France, May 2006. (PDF)

T. N. Sainath and T. J. Hazen, "A Sinusoidal Model Approach to Acoustic Landmark Detection and Segmentation for Robust Segment-Based Speech Recognition," Proc. ICASSP 2006, Toulouse, France, May 2006. (PDF)

Y. Wang, J. Lee, and A. Acero, "Speech Utterance Classification Model Training Without Manual Transcriptions," Proc. ICASSP 2006, Toulouse, France, May 2006. (PDF)

2005

J. Lee and S. Seneff, "Interlingua-Based Translation for Language Learning Systems," Proc. ASRU, 133-138, San Juan, Puerto Rico, December 2005. (PDF)

A. Park and J. Glass, "Towards Unsupervised Pattern Discovery in Speech," Proc. ASRU, 53-58, San Juan, Puerto Rico, December 2005. (PDF)

A. Gruenstein, C. Wang, and S. Seneff, "Context-Sensitive Statistical Language Modeling," Proc. Interspeech, 17-20, Lisbon, Portugal, September 2005. (PDF)

I. Lee Hetherington, "A Multi-Pass, Dynamic-Vocabulary Approach to Real-Time, Large-Vocabulary Speech Recognition," Proc. Interspeech, 545-548, Lisbon, Portugal, September 2005. (PDF)

K. Schutte and J. Glass, "Robust Detection of Sonorant Landmarks," Proc. Interspeech, 1005-1008, Lisbon, Portugal, September 2005. (PDF)

C. Wang, S. Seneff, and G. Chung, "Language Model Data Filtering via User Simulation and Dialogue Resynthesis," Proc. Interspeech, 21-24, Lisbon, Portugal, September 2005. (PDF)

O. Scharenborg and S. Seneff, "A Two-Pass Strategy for Handling OOVs in a Large Vocabulary Recognition Task," Proc. Interspeech, 1669-1672, Lisbon, Portugal, September 2005. (PDF)

G. Chung, S. Seneff, and C. Wang, "Automatic Induction of Language Model Data for a Spoken Dialogue System," Proc. SIGDIAL, Lisbon, Portugal, September 2005. (PDF)

A. Gruenstein, J. Niekrasz, and M. Purver, "Meeting Structure Annotation: Data and Tools," Proc. SIGDIAL, Lisbon, Portugal, September 2005. (PDF)

E. Filisko and S. Seneff, "Developing City Name Acquisition Strategies in Spoken Dialogue Systems Via User Simulation," Proc. SIGDIAL, Lisbon, Portugal, September 2005. (PDF)

T. Hazen, L. Hetherington, H. Shu, and K. Livescu, "Pronunciation modeling using a finite-state transducer representation," Speech Communication. Vol. 46, No. 2, pp. 189-203, June, 2005. (Preprint PDF) (Speech Communication Home Page)

G. Choueiter and J. Glass, "A Wavelet and Filter Bank Framework for Phonetic Classification," Proc. ICASSP, Philadelphia, March 2005. (PDF)

M. Hasegawa-Johnson, J. Baker, S. Borys, K. Chen, E. Coogan, S. Greenberg, A. Juneja, K. Kirchhoff, K. Livescu, K. Sonmez, S. Mohan, J. Muller, and T. Wang, "Landmark-based speech recognition: Report of the 2004 Johns Hopkins Summer Workshop," Proc. ICASSP, Philadelphia, March 2005. (PDF)

A. Park, T. Hazen, and J. Glass, "Automatic Processing of Audio Lectures for Information Retrieval: Vocabulary Selection and Language Modeling," Proc. ICASSP, Philadelphia, March 2005. (PDF)

K. Saenko, K. Livescu, J. Glass, and T. Darrell, "Production Domain Modeling of Pronunciation for Visual Speech Recognition," Proc. ICASSP, Philadelphia, March 2005. (PDF)

S. Sakai, "Additive Modeling of English F0 Contour for Speech Synthesis," Proc. ICASSP, Philadelphia, March 2005. (PDF)

S. Sakai, "Fundamental Frequency Modeling for Speech Synthesis Based on a Statistical Learning Technique," IEICE Transactions on Information and Systems, pp. 489-495, March 2005. (PDF)

2004

J. Glass, E. Weinstein, S. Cyphers, J. Polifroni, G. Chung, and M. Nakano, "A Framework for Developing Conversational User Interfaces," Proc. CADUI, 354-365, Funchal, Portugal, January 2004. (PDF)

S. Seneff, "The Use of Subword Linguistic Modelling for Multiple Tasks in Speech Recognition," Speech Communication, Vol. 42, No. 3-4, pp. 373-390, April 2004. (PDF)

J. Glass, T. Hazen, L. Hetherington and C. Wang, "Analysis and processing of lecture audio data: Preliminary investigations", Proc. HLT-NAACL 2004 Workshop on Interdisciplinary Approaches to Speech Indexing and Retrieval, 9-12, Boston, MA, May, 2004. (PDF)

E. Filisko and S. Seneff, "Error Detection and Recovery in Spoken Dialogue Systems," Proc. HLT-NAACL 2004 Workshop on Spoken Language Understanding for Conversational Systems, 31-38, Boston, MA, May, 2004. (PDF)

P. Boda and E. Filisko, "Virtual Modality: a Framework for Testing and Building Multimodal Applications," Proc. HLT-NAACL 2004 Workshop on Spoken Language Understanding for Conversational Systems, 17-24, Boston, MA, May, 2004. (PDF)

K. Livescu and J. Glass, "Feature-based pronunciation modeling for speech recognition," Proc. HLT/NAACL, Boston, MA, May 2004. (PDF)

J. Lee, "Automatic Article Restoration," Proc. HLT-NAACL 2004 Student Research Workshop, Boston, MA, 195-200, May, 2004. (PDF)

E. McDermott and T. Hazen, "Minimum classification error training of landmark models for real-time continuous speech recognition," Proc. ICASSP, 937-940, Montreal, Quebec, May, 2004. (PDF)

S. Sakai, "F0 Modeling with Multi-Layer Additive Modeling Based on a Statistical Learning Technique," Proc. ISCA Speech Synthesis Workshop, 151-154, Pittsburgh, PA, June 2004. (PDF)

C. Wang and S. Seneff, "High-quality Speech Translation for Language Learning," Proc. InSTIL Symposium on Computer Assisted Language Learning, 99-102, Venice, Italy, 2004. (PDF)

S. Seneff, C. Wang, and J. Zhang, "Spoken Conversational Interaction for Language Learning," Proc. InSTIL Symposium on Computer Assisted Language Learning, 151-154, Venice, Italy, 2004. (PDF)

M. Peabody, S. Seneff, and C. Wang, "Mandarin Tone Acquisition through Typed Dialogues," Proc. InSTIL Symposium on Computer Assisted Language Learning, 173-176, Venice, Italy, 2004. (PDF)

J. Lee and S. Seneff, "Translingual Grammar Induction," Proc. Interspeech, 724-727, Jeju, South Korea, October 2004. (PDF)

J-M Kim, C. Wang, M. Peabody, and S. Seneff, "An Interactive English Pronunciation Dictionary for Korean Learners," Proc. Interspeech, 1145-1148, Jeju, South Korea, October 2004. (PDF)

G. Chung, C. Wang, S. Seneff, E. Filisko, and M. Tang, "Combining Linguistic Knowledge and Acoustic Information in Automatic Pronunciation Lexicon Generation," Proc. Interspeech, 328-332, Jeju, South Korea, October 2004. (PDF)

G. Chung, S. Seneff, C. Wang, and L. Hetherington, "A Dynamic Vocabulary Spoken Dialogue Interface," Proc. Interspeech, 327-330, Jeju, South Korea, October 2004. (PDF)

K. Livescu and J. Glass, "Feature-based pronunciation modeling with trainable asynchrony probabilities," Proc. ICSLP, Jeju, South Korea, October 2004. (PDF)

L. Hetherington, "The MIT Finite-State Transducer Toolkit for Speech and Language Processing," Proc. ICSLP, Jeju, South Korea, October 2004. (PDF)

A. Park and T. J. Hazen, "A comparison of normalization and training approaches for ASR-dependent speaker identification," Proc. Interspeech, Jeju, South Korea, October, 2004. (PDF)

T.J. Hazen, E. Saenko, C.H. La and J. Glass, "A segment-based audio-visual speech recognizer: Data collection, development and initial experiments," Proc. ICMI, State College, PA, October 2004. (PDF)

E. Saenko, T. Darrell, and J. Glass, "Articulatory Features for Robust Visual Speech Recognition," Proc. ICMI, State College, PA, October 2004. (PDF)

2003

J. Glass, "A Probabilistic Framework for Segment-Based Speech Recognition," Computer Speech and Language 17, 137-152, 2003. (PDF)

J. Glass and S. Seneff, "Flexible and Personalizable Mixed-Initiative Dialogue Systems," Proc. HLT-NAACL Workshop on Research Directions in Dialogue Processing, Edmonton, Canada, May 2003. (PDF)

G. Chung, S. Seneff, and C. Wang, "Automatic Acquisition of Names Using Speak and Spell Mode in Spoken Dialogue Systems," Proc. HLT-NAACL 2003, Edmonton, Canada, May, 2003, pp. 197-200. (PDF)

E. Filisko and S. Seneff, "A Context Resolution Server for the Galaxy Conversational Systems," Proc. Eurospeech, 197-200, Geneva, Switzerland, September 2003. (PDF)

T. J. Hazen, D. A. Jones, A. Park, L. C. Kukolich, and D. A. Reynolds, "Integration of Speaker Recognition into Conversational Spoken Dialogue Systems," Proc. Eurospeech, 1961-1964, Geneva, Switzerland, September 2003. (PDF)

J. Schalkwyk, I. Lee Hetherington, and E. Story, "Speech Recognition with Dynamic Grammars Using Finite-State Transducers," Proc. Eurospeech, 1969-1972, Geneva, Switzerland, September 2003. (PDF)

K. Livescu, J. Glass, and J. Bilmes, "Hidden Feature Models for Speech Recognition Using Dynamic Bayesian Networks," Proc. Eurospeech, 2529-2532, Geneva, Switzerland, September 2003. (PDF)

M. Nakano and T. J. Hazen, "Using Untranscribed User Utterances for Improving Language Models based on Confidence Scoring," Proc. Eurospeech, 417-420, Geneva, Switzerland, September 2003. (PDF)

J. Polifroni, G. Chung, and S. Seneff, "Towards the Automatic Generation of Mixed-Initiative Dialogue Systems from Web Content," Proc. Eurospeech, 193-196, Geneva, Switzerland, September 2003. (PDF)

S. Seneff, G. Chung, and C. Wang, "Empowering End Users to Personalize Dialogue Systems through Spoken Interaction," Proc. Eurospeech, 749-752, Geneva, Switzerland, September 2003. (PDF)

S. Seneff, C. Wang, and T. J. Hazen, "Automatic Induction of N-Gram Language Models from a Natural Language Grammar," Proc. Eurospeech, 641-644, Geneva, Switzerland, September 2003. (PDF)

M. Tang, S. Seneff, and V. Zue, "Modeling Linguistic Features in Speech Recognition," Proc. Eurospeech, 2585-2588, Geneva, Switzerland, September 2003. (PDF)

T. J. Hazen, E. Weinstein, and A. Park, "Towards Robust Person Recognition on Handheld Devices Using Face and Speaker Identification Technologies," Proc. ICMI, Vancouver, Canada, November 2003. (PDF)

T. J. Hazen, E. Weinstein, R. Kabir, A. Park, and B. Heisele, "Multi-Modal Face and Speaker Identification on a Handheld Device," Proc. Workshop on Multimodal User Authentication, 113-120, Santa Barbara, California, December 2003. (PDF)

S. Sakai and J. Glass, "Fundamental Frequency Modeling for Corpus-Based Speech Synthesis Based on a Statistical Learning Technique," Proc. ASRU, 712-717, St. Thomas, U. S. Virgin Islands, December 2003. (PDF)

H. Shu, I. Lee Hetherington, and J. Glass, "Baum-Welch Training for Segment-Based Speech Recognition," Proc. ASRU, 43-48, St. Thomas, U. S. Virgin Islands, December 2003. (PDF)

M. Tang, S. Seneff, and V. Zue, "Two-Stage Continuous Speech Recognition Using Feature-Based Models: A Preliminary Study," Proc. ASRU, 49-54, St. Thomas, U. S. Virgin Islands, December 2003. (PDF)

2002

G. Zweig, J. Bilmes, T. Richardson, K. Filali, K. Livescu, P. Xu, K. Jackson, Y. Brandman, E. Sandness, E. Holtz, J. Torres, B. Byrne, "Structurally Discriminative Graphical Models for Automatic Speech Recognition: Results from the 2001 Johns Hopkins Summer Workshop," Proc. ICASSP, Orlando, Florida, June 2002. (PDF)

M. Tang, X. Luo, and S. Roukos, "Active Learning for Statistical Natural Language Parsing," Proc. ACL, Philadelphia, PA, July 2002. (PDF)

T. J. Hazen, S. Seneff, and J. Polifroni, "Recognition confidence scoring and its use in speech understanding systems," Computer Speech and Language, 16, 49-67, 2002. (PDF)

S. Seneff, "Response Planning and Generation in the MERCURY Flight Reservation System," Computer Speech and Language, 16, 283-312, 2002. (PDF)

I. Bazzi and J. Glass, "A Multi-Class Approach for Modelling Out-of-Vocabulary Words," Proc. ICSLP, 1613-1616, Denver, CO, September 2002. (PDF)

G. Chung and S. Seneff, "Integrating Speech with Keypad Input for Automatic Entry of Spelling and Pronunciation of New Words," Proc. ICSLP, 2061-2064, Denver, CO, September 2002. (PDF)

T. J. Hazen, I. Lee Hetherington, H. Shu, and K. Livescu, "Pronunciation Modeling Using a Finite-State Transducer Representation," Proc. ISCA Workshop on Pronunciation Modeling and Lexicon Adaptation, 99-104, Estes Park, CO, September 2002. (PDF)

X. Mou, S. Seneff, and V. Zue, "Integration of Supra-Lexical Linguistic Models with Speech Recognition Using Shallow Parsing and Finite State Transducers," Proc. ICSLP, 1289-1292, Denver, CO, September 2002. (PDF)

A. Park and T. J. Hazen, "ASR Dependent Techniques for Speaker Identification," Proc. ICSLP, 1337-1340, Denver, CO, September 2002. (PDF)

J. Polifroni and G. Chung, "Promoting Portability in Dialogue Management," Proc. ICSLP, 2721-2724, Denver, CO, September 2002. (PDF)

E. Pusateri and T. J. Hazen, "Rapid Speaker Adaptation Using Speaker Clustering," Proc. ICSLP, 61-64, Denver, CO, September 2002. (PDF)

H. Shu and I. Lee Hetherington, "EM Training of Finite-State Transducers and Its Application to Pronunciation Modeling," Proc. ICSLP, 1293-1296, Denver, CO, September 2002. (PDF)

J. Yi and J. Glass, "Information-Theoretic Criteria for Unit Selection Synthesis," Proc. ICSLP, 2617-2620, Denver, CO, September 2002. (PDF)

2001

T.J. Hazen and I. Bazzi, "A Comparison and Combination of Methods for OOV Word Detection and Word Confidence Scoring," Proceedings ICASSP, Salt Lake City, UT, May 2001. (PDF)

X. Mou and V. Zue, "Sublexical Modelling Using a Finite State Transducer Framework," Proc. ICASSP, Salt Lake City, UT, May 2001. (PDF)

I. Bazzi and J. Glass, "Learning Units for Domain-Independent Out-of-Vocabulary Word Modelling," Proc. Eurospeech, Aalborg, Denmark, September 2001. (PDF)

J. Glass and E. Weinstein, "SPEECHBUILDER: Facilitating Spoken Dialogue Systems Development," Proc. Eurospeech, Aalborg, Denmark, September 2001. (PDF)

M. Nakano, T. Minami, S. Seneff, T. J. Hazen, D. Scott Cyphers, J. Glass, J. Polifroni, V. Zue, "Mokusei: A Telephone-based Japanese Conversational System in the Weather Domain," Proc. Eurospeech, Aalborg, Denmark, September 2001. (PDF)

T. J. Hazen, I. Lee Hetherington and A. Park, "FST-Based Recognition Techniques for Multi-Lingual and Multi-Domain Spontaneous Speech," Proc. Eurospeech, Aalborg, Denmark, September 2001. (PDF)

I. Lee Hetherington, "An Efficient Implementation of Phonological Rules using Finite-State Transducers," Proc. Eurospeech, Aalborg, Denmark, September 2001. (PDF)

K. Livescu and J. Glass, "Segment-Based Recognition on the PhoneBook Task: Initial Results and Observations on Duration Modeling," Proc. Eurospeech, Aalborg, Denmark, September 2001. (PDF)

X. Mou, S. Seneff and V. Zue, "Context-dependent Probabilistic Hierarchical Sub-lexical Modelling Using Finite State Transducers," Proc. Eurospeech, Aalborg, Denmark, September 2001. (PDF)

M. Tang, C. Wang, and S. Seneff, "Voice Transformations: From Speech Synthesis to Mammalian Vocalizations," Proc. Eurospeech, Aalborg, Denmark, September 2001. (PDF)

C. Wang and S. Seneff, "Lexical Stress Modeling for Improved Speech Recognition of Spontaneous Telephone Speech in the JUPITER Domain," Proc. Eurospeech, Aalborg, Denmark, September 2001. (PDF)

H. Dolfing and L. Hetherington, "Incremental Language Models for Speech Recognition using Finite-State Transducers," Proc. ASRU, Madonna di Campiglio, Italy, December 2001. (PDF)

2000

V. Zue, et al., "JUPITER: A Telephone-Based Conversational Interface for Weather Information," IEEE Transactions on Speech and Audio Processing, Vol. 8, No. 1, January 2000. (PDF)

S. Seneff and J. Polifroni, "Dialogue Management in the Mercury Flight Reservation System," Proc. Dialogue Workshop, ANLP-NAACL, Seattle, April 2000. (PDF)

T. J. Hazen, "A comparison of novel techniques for rapid speaker adaptation," Speech Communication, Vol. 31 (2000), pp. 15-33, May 2000. (gzip'd PS) (PDF)

J. Polifroni and S. Seneff, "Galaxy-II as an Architecture for Spoken Dialogue Evaluation," Proc. LREC, Athens, Greece, May 2000. (PDF)

I. Bazzi and J. Glass, "Heterogeneous Lexical Units for Automatic Speech Recognition: Preliminary Investigations," Proc. ICASSP, Istanbul, Turkey, June 2000. (PDF)

S. Kamppari and T.J. Hazen, "Word and Phone Level Acoustic Confidence Scoring," Proc. ICASSP, Istanbul, Turkey, June 2000. (PDF)

K. Livescu and J. Glass, "Lexical Modeling of Non-native Speech for Automatic Speech Recognition," Proc. ICASSP, Istanbul, Turkey, June 2000. (PDF)

K. Ng, "Information Fusion for Spoken Document Retrieval," Proc. ICASSP, Istanbul, Turkey, June 2000. (PDF)

C. Wang and S. Seneff, "Robust Pitch Tracking for Prosodic Modeling in Telephone Speech," Proc. ICASSP, Istanbul, Turkey, June 2000. (PDF)

S. Seneff, J. Glass, T.J. Hazen, Y. Minami, J. Polifroni, and V. Zue, "MOKUSEI: A Japanese Spoken Dialogue System in the Weather Domain," NTT R&D Vol. 49, No. 7, 2000.

V. Zue and J. Glass, "Conversational Interfaces: Advances and Challenges," Proceedings of the IEEE, Special Issue on Spoken Language Processing, Vol. 88, August 2000. (PDF)

T. Hazen, T. Burianek, J. Polifroni and S. Seneff, "Recognition Confidence Scoring for Use in Speech Understanding Systems," Proc. ISCA Tutorial and Research Workshop: ASR2000, Paris, France, September 2000. (PDF)

V. Zue, et al., "From JUPITER to MOKUSEI: Multilingual Conversational Systems in the Weather Domain," Proc. Workshop on Multilingual Speech Communications (MSC2000), Kyoto, Japan, October 2000. (gzip'd PS)

L. Baptist and S. Seneff, "Genesis-II: A Versatile System for Language Generation in Conversational System Applications," Proc. ICSLP, Beijing, China, October 2000. (PDF)

I. Bazzi and J. Glass, "Modeling Out-of-Vocabulary Words for Robust Speech Recognition," Proc. ICSLP, Beijing, China, October 2000. (PDF)

I. Bazzi and D. Katabi, "Using Support Vector Machines for Spoken Digit Recognition," Proc. ICSLP, Beijing, China, October 2000. (PDF)

G. Chung, "A Three-stage Solution for Flexible Vocabulary Speech Understanding," Proc. ICSLP, Beijing, China, October 2000. (PDF)

G. Chung, "Automatically Incorporating Unknown Words in Jupiter," Proc. ICSLP, Beijing, China, October 2000. (PDF)

J. Glass, J. Polifroni, S. Seneff and V. Zue, "Data Collection and Performance Evaluation of Spoken Dialogue Systems: The MIT Experience," Proc. ICSLP, Beijing, China, October 2000. (PDF)

T.J. Hazen, T. Burianek, J. Polifroni and S. Seneff, "Integrating Recognition Confidence Scoring with Language Understanding and Dialogue Modeling," Proc. ICSLP, Beijing, China, October 2000. (PDF)

X. Mou and V. Zue, "The Use of Dynamic Reliability Scoring in Speech Recognition," Proc. ICSLP, Beijing, China, October 2000. (PDF)

E. Sandness and I.L. Hetherington, "Keyword-based Discriminative Training of Acoustic Models," Proc. ICSLP, Beijing, China, October 2000. (PDF)

S. Seneff, C. Chuu, and D. S. Cyphers, "Orion: From On-line Interaction to Off-line Delegation," Proc. ICSLP, Beijing, China, October 2000. (PDF)

S. Seneff and J. Polifroni, "Formal and Natural Language Generation in the Mercury Conversational System," Proc. ICSLP, Beijing, China, October 2000. (PDF)

N. Ström and S. Seneff, "Intelligent Barge-in in Conversational Systems," Proc. ICSLP, Beijing, China, October 2000. (PDF)

C. Wang and S. Seneff, "Improved Tone Recognition by Normalizing For Coarticulation and Intonation Effects," Proc. ICSLP, Beijing, China, October 2000. (PDF)

C. Wang, S. Cyphers, X. Mou, J. Polifroni, S. Seneff, J. Yi and V. Zue, "Muxing: A Telephone-Access Mandarin Conversational System," Proc. ICSLP, Beijing, China, October 2000. (PDF)

J. Yi, J. Glass and L. Hetherington, "A Flexible, Scalable Finite-State Transducer Architecture for Corpus-Based Concatenative Speech Synthesis," Proc. ICSLP, Beijing, China, October 2000. (PDF)

1999

J. Glass, T.J. Hazen and L. Hetherington, "Real-time telephone-based speech recognition in the JUPITER domain," Proc. ICASSP, Phoenix, AZ, March 1999. (PDF)

G. Chung and S. Seneff, "A Hierarchical Duration Model for Speech Recognition Based on the ANGIE Framework," Speech Communication, 27, 113-134, 1999. (gzip'd PS)

G. Chung, S. Seneff and I.L. Hetherington, "Towards Multi-Domain Speech Understanding Using a Two-Stage Recognizer," Proc. Eurospeech, Budapest, Hungary, September 1999. (PDF)

S. Seneff, R. Lau and J. Polifroni, "Organization, Communication, and Control in the GALAXY-II Conversational System," Proc. Eurospeech, Budapest, Hungary, September 1999. (PDF)

J. Glass, "Challenges for Spoken Dialogue Systems," Proc. ASRU, Keystone, CO, December 1999. (PDF)

N. Ström, L. Hetherington, T.J. Hazen, E. Sandness and J. Glass, "Acoustic Modeling Improvements in a Segment-Based Speech Recognizer," Proc. ASRU, Keystone, CO, December 1999. (PDF)

1998

T.J. Hazen and A. Halberstadt, "Using Aggregation to Improve the Performance of Mixture Gaussian Acoustic Models," Proc. ICASSP, Seattle, WA, May 1998. (PDF)

K. Ng and V. Zue, "Phonetic Recognition for Spoken Document Retrieval," Proc. ICASSP, Seattle, WA, May 1998. (PDF)

J. Polifroni, S. Seneff, J. Glass, and T.J. Hazen, "Evaluation Methodology for a Telephone-based Conversational System," Proc. LREC, 42-50, Granada, Spain, May 1998. (PDF)

G. Chung and S. Seneff, "Improvements in Speech Understanding Accuracy through the Integration of Hierarchical Linguistic, Prosodic, and Phonological Constraints in the Jupiter Domain," Proc. ICSLP, Sydney, Australia, November 1998. (PDF)

J. Glass and T.J. Hazen, "Telephone-Based Conversational Speech Recognition in the Jupiter Domain," Proc. ICSLP, Sydney, Australia, November 1998. (PDF)

A. Halberstadt and J. Glass, "Heterogeneous Measurements and Multiple Classifiers for Speech Recognition," Proc. ICSLP, Sydney, Australia, November 1998. (PDF)

R. Lau and S. Seneff, "A Unified System for Sublexical and Linguistic Modelling Supporting Flexible Vocabulary Speech Understanding," Proc. ICSLP, Sydney, Australia, November 1998. (PDF)

S. Lee and J. Glass, "Real-Time Probabilistic Segmentation for Segment-Based Speech Recognition," Proc. ICSLP, Sydney, Australia, November 1998. (PDF)

C. Pao, P. Schmid, and J. Glass, "Confidence Scoring for Speech Understanding Systems," Proc. ICSLP, Sydney, Australia, November 1998. (PDF)

S. Seneff, "The Use of Linguistic Hierarchies in Speech Understanding," Proc. ICSLP, Sydney, Australia, November 1998. (PDF)

S. Seneff, E. Hurley, R. Lau, C. Pao, P. Schmid, and V. Zue, "Galaxy-II: A Reference Architecture for Conversational System Development," Proc. ICSLP, Sydney, Australia, November 1998. (PDF)

C. Wang and S. Seneff, "A Study of Tones and Tempo in Continuous Mandarin Digit Strings and their Application in Telephone Quality Speech Recognition," Proc. ICSLP, Sydney, Australia, November 1998. (PDF)

J. Yi and J. Glass, "Natural-Sounding Speech Synthesis Using Variable-Length Units," Proc. ICSLP, Sydney, Australia, November 1998. (PDF)


32 Vassar Street
Cambridge, MA 02139 USA
(+1) 617.253.3049
 


©2013, Spoken Language Systems Group. All rights reserved.
