The OPUS Resource Repository: An Open Package for Creating Parallel Corpora and Machine Translation Services

Abstract

This paper presents a flexible and powerful system for creating parallel corpora and for running neural machine translation services. Our package provides a scalable data repository backend that offers transparent data pre-processing pipelines and automatic alignment procedures that facilitate the compilation of extensive parallel data sets from a variety of sources. Moreover, we develop a web-based interface that constitutes an intuitive frontend for end-users of the platform. The whole system can easily be distributed over virtual machines and implements a sophisticated permission system with secure connections and a flexible database for storing arbitrary metadata. Furthermore, we also provide an interface for neural machine translation that can run as a service on virtual machines, which also incorporates a connection to the data repository software.

Bibliographic details

Source: Nordic Conference of Computational Linguistics | 2019 | xx 410 p. | 6 pages
Authors: Mikko Aulamo; Jörg Tiedemann
Original format: PDF
Chinese Library Classification: Programming, software engineering
Date indexed: 2022-08-20 20:19:22

Similar literature

1. Neural machine translation for low-resource languages without parallel corpora [J]. Alina Karakanta, Jon Dehdari, Josef van Genabith. Machine Translation, 2018, no. 1-2
2. Automatic induction of bilingual resources from aligned parallel corpora: application to shallow-transfer machine translation [J]. Helena M. Caseli, Maria das Gracas V. Nunes, Mikel L. Forcada. Machine Translation, 2006, no. 4
3. Combining MEDLINE and publisher data to create parallel corpora for the automatic translation of biomedical text [J]. Antonio Jimeno Yepes, Élise Prieur-Gaston, Aurélie Névéol. BMC Bioinformatics, 2013, no. 1
4. The OPUS Resource Repository: An Open Package for Creating Parallel Corpora and Machine Translation Services [C]. Mikko Aulamo, Jörg Tiedemann. Nordic Conference of Computational Linguistics, 2019
5. Evaluating Parallel Corpora and Translation Quality for Chinese and English [D]. Wei, Liu. 2016
6. Combining MEDLINE and publisher data to create parallel corpora for the automatic translation of biomedical text [O]. Antonio Jimeno Yepes, Élise Prieur-Gaston, Aurélie Névéol. 2013
