IEEE International Conference on Acoustics, Speech and Signal Processing

What is best for spoken language understanding: small but task-dependant embeddings or huge but out-of-domain embeddings?


Abstract

Word embeddings have been shown to be a great asset for several Natural Language and Speech Processing tasks. While they have already been evaluated on various NLP tasks, their evaluation on spoken language understanding (SLU) is less studied. The goal of this study is two-fold: first, it focuses on the semantic evaluation of common word embedding approaches for the SLU task; second, it investigates the use of two different data sets to train the embeddings: a small, task-dependent corpus or a huge, out-of-domain corpus. Experiments are carried out on 5 benchmark corpora (ATIS, SNIPS, SNIPS70, M2M, MEDIA), on which a relevance ranking was proposed in the literature. Interestingly, the performance of the embeddings is independent of the difficulty of the corpora. Moreover, the embeddings trained on the huge, out-of-domain corpus yield better results than the ones trained on the small, task-dependent corpus.
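As a rough illustration of the two training regimes compared in the abstract, the sketch below builds an embedding matrix for a toy SLU vocabulary in two ways: by training a small Word2Vec model on task utterances, and by loading large pretrained out-of-domain vectors. The gensim models, the GloVe download, the toy corpus, and the 100-dimensional size are illustrative assumptions, not the setup used in the paper.

```python
# Hypothetical sketch of the two embedding regimes compared in the paper:
# small/task-dependent vs. huge/out-of-domain. Not the authors' exact setup.
import numpy as np
import gensim.downloader
from gensim.models import Word2Vec

# A tiny, task-dependent "corpus": tokenized SLU training utterances (ATIS-style).
task_corpus = [
    ["show", "me", "flights", "from", "boston", "to", "denver"],
    ["i", "want", "to", "book", "a", "flight", "to", "paris", "tomorrow"],
]

# Option 1: small but task-dependent embeddings, trained directly on the SLU corpus.
small_model = Word2Vec(task_corpus, vector_size=100, window=5, min_count=1, epochs=50)

# Option 2: huge but out-of-domain embeddings, pretrained on generic web/news text.
big_vectors = gensim.downloader.load("glove-wiki-gigaword-100")  # ~400k-word vocabulary

def embedding_matrix(vocab, keyed_vectors, dim=100):
    """Build the embedding matrix fed to a downstream slot-tagging network.
    Out-of-vocabulary words fall back to a small random vector."""
    rng = np.random.default_rng(0)
    matrix = np.zeros((len(vocab), dim), dtype=np.float32)
    for idx, word in enumerate(vocab):
        if word in keyed_vectors:
            matrix[idx] = keyed_vectors[word]
        else:
            matrix[idx] = rng.normal(scale=0.1, size=dim)
    return matrix

vocab = sorted({w for utt in task_corpus for w in utt})
small_matrix = embedding_matrix(vocab, small_model.wv)
big_matrix = embedding_matrix(vocab, big_vectors)
print(small_matrix.shape, big_matrix.shape)  # both (len(vocab), 100)
```

In an SLU slot tagger, such a matrix would typically initialize the embedding layer; the two regimes differ only in where the word vectors come from, which is the comparison the study evaluates.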
