IEEE International Conference on Acoustics, Speech and Signal Processing

What is best for spoken language understanding: small but task-dependant embeddings or huge but out-of-domain embeddings?


Abstract

Word embeddings have been shown to be a great asset for several Natural Language and Speech Processing tasks. While they have already been evaluated on various NLP tasks, their evaluation on spoken or natural language understanding (SLU) is less studied. The goal of this study is two-fold: first, it focuses on the semantic evaluation of common word embedding approaches for the SLU task; second, it investigates the use of two different data sets to train the embeddings: a small, task-dependent corpus or a huge, out-of-domain corpus. Experiments are carried out on 5 benchmark corpora (ATIS, SNIPS, SNIPS70, M2M, MEDIA), for which a relevance ranking was proposed in the literature. Interestingly, the performance of the embeddings is independent of the difficulty of the corpora. Moreover, the embeddings trained on a huge, out-of-domain corpus yield better results than the ones trained on a small, task-dependent corpus.
