Conference: IEEE International Conference on Acoustics, Speech and Signal Processing

Models of Visually Grounded Speech Signal Pay Attention to Nouns: A Bilingual Experiment on English and Japanese

Abstract

We investigate the behaviour of attention in neural models of visually grounded speech trained on two languages: English and Japanese. Experimental results show that attention focuses on nouns and that this behaviour holds for two typologically very different languages. We also draw parallels between artificial neural attention and human attention and show that neural attention focuses on word endings, as has been theorised for human attention. Finally, we investigate how two visually grounded monolingual models can be used to perform cross-lingual speech-to-speech retrieval. For both languages, the enriched bilingual (speech-image) corpora, with part-of-speech tags and forced alignments, are distributed to the community for reproducible research.