Conference on Empirical Methods in Natural Language Processing

Invited Speaker: Sharon Goldwater, University of Edinburgh: Towards more universal language technology: unsupervised learning from speech

Abstract

Speech and language processing has advanced enormously in the last decade, with successful applications in machine translation, voice-activated search, and even language-enabled personal assistants. Yet these systems typically still rely on learning from very large quantities of human-annotated data. These resource-intensive methods mean that effective technology is available for only a tiny fraction of the world's 7000 or so languages, mainly those spoken in large rich countries. This talk describes our recent work on developing unsupervised speech technology, where transcripts and pronunciation dictionaries are not used. The work is inspired by considering both how young infants may begin to acquire the sounds and words of their language, and how we might develop systems to help linguists analyze and document endangered languages. I will first present work on learning from speech audio alone, where the system must learn to segment the speech stream into word tokens and cluster repeated instances of the same word together to learn a lexicon of vocabulary items. The approach combines Bayesian and neural network methods to address learning at the word and sub-word levels.