Journal: IEEE Transactions on Audio, Speech, and Language Processing

Discriminative Reranking for Spoken Language Understanding

Abstract

Spoken language understanding (SLU) is concerned with the extraction of meaning structures from spoken utterances. Recent computational approaches to SLU, e.g., conditional random fields (CRFs), optimize local models by encoding several features, mainly based on simple n-grams. In contrast, recent work has shown that the accuracy of CRFs can be significantly improved by modeling long-distance dependency features. In this paper, we propose novel approaches to encode all possible dependencies between features and, most importantly, among parts of the meaning structure, e.g., concepts and their combinations. We rerank hypotheses generated by local models, e.g., stochastic finite state transducers (SFSTs) or CRFs, with a global model. The latter encodes a very large number of dependencies (in the form of trees or sequences) by applying kernel methods to the space of all meaning (sub)structures. We performed comparative experiments between SFSTs, CRFs, support vector machines (SVMs), and our proposed discriminative reranking models (DRMs) on representative conversational speech corpora in three different languages: the ATIS (English), the MEDIA (French), and the LUNA (Italian) corpora. These corpora were collected within three different application domains of increasing complexity: informational, transactional, and problem-solving tasks, respectively. The results show that our DRMs consistently outperform the state-of-the-art models based on CRFs.
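
The reranking idea in the abstract (a local model proposes an n-best list of hypotheses, and a global kernel-based model picks among them) can be illustrated with a small sketch. This is not the paper's model: the n-gram count kernel and the kernel perceptron below are stand-ins chosen for brevity, and the names `ngram_counts`, `seq_kernel`, and `KernelReranker`, as well as the toy concept labels, are hypothetical.

```python
from collections import Counter


def ngram_counts(seq, max_n=2):
    """Count all contiguous n-grams (n = 1..max_n) of a concept sequence."""
    counts = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(seq) - n + 1):
            counts[tuple(seq[i:i + n])] += 1
    return counts


def seq_kernel(a, b, max_n=2):
    """Toy sequence kernel: dot product of the two n-gram count vectors."""
    ca, cb = ngram_counts(a, max_n), ngram_counts(b, max_n)
    return sum(v * cb[k] for k, v in ca.items())


class KernelReranker:
    """Kernel perceptron trained on (better, worse) hypothesis pairs."""

    def __init__(self, kernel):
        self.kernel = kernel
        self.support = []  # stored (better, worse) pairs, each with weight 1

    def score(self, hyp):
        # A hypothesis scores high if it resembles stored "better"
        # hypotheses more than stored "worse" ones.
        return sum(self.kernel(b, hyp) - self.kernel(w, hyp)
                   for b, w in self.support)

    def fit(self, nbest_lists, epochs=5):
        # nbest_lists: list of (hypotheses, index_of_best_hypothesis)
        for _ in range(epochs):
            for hyps, best in nbest_lists:
                for j, h in enumerate(hyps):
                    # Perceptron update: store the pair when it is misranked.
                    if j != best and self.score(hyps[best]) <= self.score(h):
                        self.support.append((hyps[best], h))

    def rerank(self, hyps):
        return max(hyps, key=self.score)


# Toy n-best lists of concept sequences (labels are made up for illustration).
train = [
    ([["command", "arrival.city", "arrival.city"],
      ["command", "departure.city", "arrival.city"]], 1),
    ([["departure.city", "arrival.city", "date"],
      ["arrival.city", "arrival.city", "date"]], 0),
]
reranker = KernelReranker(seq_kernel)
reranker.fit(train)

nbest = [
    ["command", "arrival.city", "arrival.city", "date"],
    ["command", "departure.city", "arrival.city", "date"],
]
best = reranker.rerank(nbest)
```

In this sketch the global model sees each hypothesis as a whole sequence, so it can penalize an implausible combination of concepts (two arrival cities in a row) that a purely local tagger might emit. A tree kernel over the full meaning structure would play the role of `seq_kernel` in a richer setup.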
