Annual Meeting of the Association for Computational Linguistics

Invited Talk - Simultaneous Translation: Recent Advances and Remaining Challenges



Abstract

Simultaneous interpretation (i.e., translating concurrently with the source-language speech) is widely used in many scenarios, including multilateral organizations (UN/EU), international summits (APEC/G-20), legal proceedings, and press conferences. However, it is well known to be one of the most challenging tasks for humans because it requires simultaneous perception and production in two languages. As a result, there are only a few thousand professional simultaneous interpreters worldwide, and each can sustain the task for only 15-30 minutes per turn. On the other hand, simultaneous translation (either speech-to-text or speech-to-speech) is also notoriously difficult for machines and has remained one of the holy grails of AI. A key challenge here is the word-order difference between the source and target languages. For example, if you simultaneously translate German (an SOV language) into English (an SVO language), you often have to wait for the sentence-final German verb. Therefore, most existing "real-time" translation systems resort to conventional full-sentence translation, causing an undesirable latency of at least one sentence and leaving the audience largely out of sync with the speaker. There have been efforts toward genuine simultaneous translation, but with limited success. Recently, at Baidu Research, we discovered a much simpler and surprisingly effective approach to simultaneous (speech-to-text) translation by designing a "prefix-to-prefix" framework tailored to the simultaneity requirement. This is in contrast with the "sequence-to-sequence" framework, which assumes the availability of the full input sentence. Our approach results in the first simultaneous translation system that achieves reasonable translation quality with controllable latency. Our technique has been successfully deployed to simultaneously translate Chinese speeches into English subtitles at the 2018 Baidu World Conference, and has been demoed live at the NeurIPS 2018 Expo Day.
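The best-known instance of the prefix-to-prefix framework is a fixed "wait-k" policy: read the first k source tokens, then alternate writing one target token and reading one source token, so the i-th output is conditioned only on a source prefix rather than the full sentence. The following is a minimal sketch of that read/write schedule; the `predict` callback is a hypothetical stand-in for an incremental translation model, not part of the deployed system.

```python
from typing import Callable, Iterator, List

def wait_k_translate(source: Iterator[str], k: int,
                     predict: Callable[[List[str], List[str]], str]) -> List[str]:
    """Decode with a fixed wait-k prefix-to-prefix policy.

    The i-th target token (0-indexed) is predicted from at most the
    first i + k source tokens -- a prefix -- never the full sentence.
    `predict(src_prefix, target_so_far)` stands in for an incremental
    NMT model that returns the next target token.
    """
    src_prefix: List[str] = []
    target: List[str] = []
    source = iter(source)
    exhausted = False
    while True:
        # READ: stay k source tokens ahead of the output (until source ends).
        while not exhausted and len(src_prefix) < len(target) + k:
            try:
                src_prefix.append(next(source))
            except StopIteration:
                exhausted = True
        # Stop once the source is finished and the output has caught up.
        if exhausted and len(target) >= len(src_prefix):
            break
        # WRITE: commit one target token based only on the current prefix.
        target.append(predict(src_prefix, target))
    return target
```

With k=1 and a toy word-for-word `predict`, the decoder emits each output token after seeing just one more source token, illustrating the constant (rather than full-sentence) latency; a real system would plug in a prefix-trained translation model and a sensible stopping criterion for length mismatches.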
Inspired by the success of this very simple approach, we have extended it to produce more flexible translation strategies. Our work has also generated renewed interest in this long-standing problem in the CL community; for instance, two recent papers from Google proposed interesting improvements based on our ideas. Time permitting, I will also discuss our efforts toward the ultimate goal of simultaneous speech-to-speech translation, and conclude with a list of remaining challenges. See demos, media coverage, and more info at: https://simultrans-demo.github.io/

Bio: Liang Huang is Principal Scientist and Head of the Institute of Deep Learning USA (IDL-US) at Baidu Research and Assistant Professor (on leave) at Oregon State University. He received his PhD from the University of Pennsylvania in 2008 and his BS from Shanghai Jiao Tong University in 2003. He was previously a research scientist at Google, a research assistant professor at USC/ISI, an assistant professor at CUNY, and a part-time research scientist at IBM. His research is in the theoretical aspects of computational linguistics. Many of his efficient algorithms in parsing, translation, and structured prediction have become standards in the field, for which he received a Best Paper Award at ACL 2008, a Best Paper Honorable Mention at EMNLP 2016, and several best paper nominations (ACL 2007, EMNLP 2008, ACL 2010, and SIGMOD 2018). He is also a computational biologist, adapting his parsing algorithms to RNA and protein folding. He is an award-winning teacher and a best-selling author. His work has garnered widespread media attention, including Fortune, CNBC, IEEE Spectrum, and MIT Technology Review.
