Speech understanding for spoken dialogue systems: From corpus harvesting to grammar rule induction

Elias Iosif; Ioannis Klasinas; Georgia Athanasopoulou; Elisavet Palogiannidi; Spiros Georgiladakis; Katerina Louka; Alexandros Potamianos

Abstract

We investigate algorithms and tools for the semi-automatic authoring of grammars for spoken dialogue systems (SDS), proposing a framework that spans from corpus creation to grammar induction. A realistic human-in-the-loop approach is followed, balancing automation and human intervention to optimize the cost-to-performance ratio of grammar development. Web harvesting is the main approach investigated for eliciting spoken dialogue textual data, while crowdsourcing is proposed as an alternative method. Several techniques are presented for constructing web queries and filtering the acquired corpora. We also investigate how the harvested corpora can be used for the automatic and semi-automatic (human-in-the-loop) induction of grammar rules. SDS grammar rules and induction algorithms are grouped into two types, namely low-level and high-level. Two families of algorithms are investigated for rule induction: one based on semantic similarity and distributional semantic models, and the other using more traditional statistical modeling approaches (e.g., slot-filling algorithms using Conditional Random Fields). Evaluation results are presented for two domains and languages. High-level induction precision scores of up to 60% are obtained. The results support the portability of the proposed features and algorithms across languages and domains.

Bibliographic details

Source
Computer Speech and Language, 2018, Issue 1, pp. 272-297 (26 pages)
Authors
Elias Iosif; Ioannis Klasinas; Georgia Athanasopoulou; Elisavet Palogiannidi; Spiros Georgiladakis; Katerina Louka; Alexandros Potamianos
Author affiliations

School of Electrical and Computer Engineering, National Technical University of Athens, 15780 Athens, Greece, and "Athena" Research and Innovation Center in Information, Communication and Knowledge Technologies, 15125 Athens, Greece;

School of Electronic and Computer Engineering, Technical University of Crete, 73100 Chania, Greece;

School of Electronic and Computer Engineering, Technical University of Crete, 73100 Chania, Greece;

School of Electronic and Computer Engineering, Technical University of Crete, 73100 Chania, Greece;

School of Electronic and Computer Engineering, Technical University of Crete, 73100 Chania, Greece;

Voice Web S.A., 15124 Athens, Greece;

School of Electrical and Computer Engineering, National Technical University of Athens, 15780 Athens, Greece, and "Athena" Research and Innovation Center in Information, Communication and Knowledge Technologies, 15125 Athens, Greece;

Indexed in: Science Citation Index (SCI, USA); Engineering Index (EI, USA)
Original format: PDF
Language: English
Keywords
Spoken dialogue systems; Grammar induction; Corpora creation; Semantic similarity; Web mining; Crowdsourcing;
