Spoken Language Understanding - Systems for Extracting Semantic Information from Speech



Spoken language understanding (SLU) is an emerging field between speech processing and language processing. It investigates human/machine and human/human communication by leveraging technologies from signal processing, pattern recognition, machine learning and artificial intelligence. SLU systems are designed to extract meaning from speech utterances, and their applications are vast, from voice search on mobile devices to meeting summarization, attracting interest from both the commercial and academic sectors.

Both human/machine and human/human communications can benefit from the application of SLU, using differing tasks and approaches to better understand and utilize such communications. This book covers the state-of-the-art approaches for the most popular SLU tasks with chapters written by well-known researchers in the respective fields. Key features include:

  • Presents a fully integrated view of the two distinct disciplines of speech processing and language processing for SLU tasks.
  • Defines what is possible today for SLU as an enabling technology for enterprise (e.g., customer care centers or company meetings) and consumer (e.g., entertainment, mobile, car, robot, or smart environments) applications, and outlines the key research areas.
  • Provides a unique source of distilled information on methods for computer modeling of semantic information in human/machine and human/human conversations.

This book is suitable for graduate courses in electronics engineering, computer science or computational linguistics. Moreover, technologists interested in processing spoken communications will find it a useful source of collated information on the topic, drawn from the two distinct disciplines of speech processing and language processing under the new area of SLU.


Gokhan Tur, Microsoft Research, California, USA
Dr Tur is currently a Principal Scientist in the Speech at Microsoft Department at Microsoft Research, Mountain View, California, USA. He was formerly a Research Scientist at SRI International, an independent, nonprofit research institute conducting client-sponsored research and development for government agencies, commercial businesses, foundations, and other organizations. He has co-authored more than 70 journal and conference papers. Dr Tur was the recipient of the Speech Communication Journal Best Paper awards by ISCA for 2004-2006 and by EURASIP for 2005-2006. He is a senior member of IEEE, ACL, and ISCA, and a member of the IEEE Signal Processing Society (SPS) and the Speech and Language Technical Committee (SLTC) for 2006-2008. He was a guest editor of Speech Communication (Elsevier) for a special issue on Spoken Language Understanding (SLU). He has been involved in organising various conferences and is the spoken language processing area chair for ICASSP 2009.

Renato De Mori, University of Avignon, France
Dr De Mori is a Professor of Computer Science at the Université d'Avignon and Director of its Laboratoire d'Informatique, and is a Visiting Professor at McGill University, Canada. He is a Fellow of the Computer Society of the IEEE and a Distinguished Lecturer of the IEEE Signal Processing Society. He is the author or editor of four books and has published more than 100 scientific papers in international journals. Professor De Mori has been a member of the Executive Advisory Board at the IBM Toronto Lab; Scientific Advisor at France Télécom R&D; Chairman of the Computer and Information Systems Committee, Natural Sciences and Engineering Council of Canada; and Vice-President R&D, Centre de Recherche en Informatique de Montréal.


List of Contributors.



1 Introduction (Gokhan Tur and Renato De Mori).

1.1 A Brief History of Spoken Language Understanding.

1.2 Organization of the Book.


2 History of Knowledge and Processes for Spoken Language Understanding (Renato De Mori).

2.1 Introduction.

2.2 Meaning Representation and Sentence Interpretation.

2.3 Knowledge Fragments and Semantic Composition.

2.4 Probabilistic Interpretation in SLU Systems.

2.5 Interpretation with Partial Syntactic Analysis.

2.6 Classification Models for Interpretation.

2.7 Advanced Methods and Resources for Semantic Modeling and Interpretation.

2.8 Recent Systems.

2.9 Conclusions.


3 Semantic Frame-based Spoken Language Understanding (Ye-Yi Wang, Li Deng and Alex Acero).

3.1 Background.

3.2 Knowledge-based Solutions.

3.3 Data-driven Approaches.

3.4 Summary.


4 Intent Determination and Spoken Utterance Classification (Gokhan Tur and Li Deng).

4.1 Background.

4.2 Task Description.

4.3 Technical Challenges.

4.4 Benchmark Data Sets.

4.5 Evaluation Metrics.

4.6 Technical Approaches.

4.7 Discussion and Conclusions.


5 Voice Search (Ye-Yi Wang, Dong Yu, Yun-Cheng Ju and Alex Acero).

5.1 Background.

5.2 Technology Review.

5.3 Summary.


6 Spoken Question Answering (Sophie Rosset, Olivier Galibert and Lori Lamel).

6.1 Introduction.

6.2 Specific Aspects of Handling Speech in QA Systems.

6.3 QA Evaluation Campaigns.

6.4 Question-answering Systems.

6.5 Projects Integrating Spoken Requests and Question Answering.

6.6 Conclusions.


7 SLU in Commercial and Research Spoken Dialogue Systems (David Suendermann and Roberto Pieraccini).

7.1 Why Spoken Dialogue Systems (Do Not) Have to Understand.

7.2 Approaches to SLU for Dialogue Systems.

7.3 From Call Flow to POMDP: How Dialogue Management Integrates with SLU.

7.4 Benchmark Projects and Data Sets.

7.5 Time is Money: The Relationship between SLU and Overall Dialogue System Performance.

7.6 Conclusion.


8 Active Learning (Dilek Hakkani-Tür and Giuseppe Riccardi).

8.1 Introduction.

8.2 Motivation.

8.3 Learning Architectures.

8.4 Active Learning Methods.

8.5 Combining Active Learning with Semi-supervised Learning.

8.6 Applications.

8.7 Evaluation of Active Learning Methods.

8.8 Discussion and Conclusions.



9 Human/Human Conversation Understanding (Gokhan Tur and Dilek Hakkani-Tür).

9.1 Background.

9.2 Human/Human Conversation Understanding Tasks.

9.3 Dialogue Act Segmentation and Tagging.

9.4 Action Item and Decision Detection.

9.5 Addressee Detection and Co-reference Resolution.

9.6 Hot Spot Detection.

9.7 Subjectivity, Sentiment, and Opinion Detection.

9.8 Speaker Role Detection.

9.9 Modeling Dominance.

9.10 Argument Diagramming.

9.11 Discussion and Conclusions.


10 Named Entity Recognition (Frédéric Béchet).

10.1 Task Description.

10.2 Challenges Using Speech Input.

10.3 Benchmark Data Sets, Applications.

10.4 Evaluation Metrics.

10.5 Main Approaches for Extracting NEs from Text.

10.6 Comparative Methods for NER from Speech.

10.7 New Trends in NER from Speech.

10.8 Conclusions.


11 Topic Segmentation (Matthew Purver).

11.1 Task Description.

11.2 Basic Approaches, and the Challenge of Speech.

11.3 Applications and Benchmark Datasets.

11.4 Evaluation Metrics.

11.5 Technical Approaches.

11.6 New Trends and Future Directions.


12 Topic Identification (Timothy J. Hazen).

12.1 Task Description.

12.2 Challenges Using Speech Input.

12.3 Applications and Benchmark Tasks.

12.4 Evaluation Metrics.

12.5 Technical Approaches.

12.6 New Trends and Future Directions.


13 Speech Summarization (Yang Liu and Dilek Hakkani-Tür).

13.1 Task Description.

13.2 Challenges when Using Speech Input.

13.3 Data Sets.

13.4 Evaluation Metrics.

13.5 General Approaches.

13.6 More Discussions on Speech versus Text Summarization.

13.7 Conclusions.


14 Speech Analytics (I. Dan Melamed and Mazin Gilbert).

14.1 Introduction.

14.2 System Architecture.

14.3 Speech Transcription.

14.4 Text Feature Extraction.

14.5 Acoustic Feature Extraction.

14.6 Relational Feature Extraction.

14.7 DBMS.

14.8 Media Server and Player.

14.9 Trend Analysis.

14.10 Alerting System.

14.11 Conclusion.


15 Speech Retrieval (Ciprian Chelba, Timothy J. Hazen, Bhuvana Ramabhadran and Murat Saraçlar).

15.1 Task Description.

15.2 Applications.

15.3 Challenges Using Speech Input.

15.4 Evaluation Metrics.

15.5 Benchmark Data Sets.

15.6 Approaches.

15.7 New Trends.

15.8 Discussion and Conclusions.




“The book also contains references to existing datasets that can be used by researchers interested in the field; these, together with the presented baseline, equip one with the necessary tools to step into this very daring and fascinating domain.” (Zentralblatt MATH, 2012)