The Challenges of Optimizing Machine Translation for Low Resource Cross-Language Information Retrieval

Abstract

When performing cross-language information retrieval (CLIR) for lower-resourced languages, a common approach is to retrieve over the output of machine translation (MT). However, there is no established guidance on how to optimize the resulting MT-IR system. In this paper, we examine the relationship between the performance of MT systems and both neural and term frequency-based IR models to identify how CLIR performance can be best predicted from MT quality. We explore performance at varying amounts of MT training data, byte pair encoding (BPE) merge operations, and across two IR collections and retrieval models. We find that the choice of IR collection can substantially affect the predictive power of MT tuning decisions and evaluation, potentially introducing dissociations between MT-only and overall CLIR performance.
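
One way to make "predicting CLIR performance from MT quality" concrete is to correlate a per-configuration MT score with the retrieval score obtained over that configuration's output. The block below is a minimal sketch of that idea, not the authors' procedure: the abstract does not say which MT or IR metrics or which correlation statistic were used, so the function names, the choice of Pearson and Spearman correlation, and the placeholder numbers in the usage section are all assumptions made for illustration.

```python
from math import sqrt
from typing import List, Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


if __name__ == "__main__":
    # Hypothetical placeholder values, NOT results from the paper:
    # one MT score and one retrieval score per MT configuration.
    mt_scores = [10.0, 15.0, 20.0, 25.0]
    clir_scores = [0.10, 0.18, 0.17, 0.30]
    print("Pearson:", pearson(mt_scores, clir_scores))
    print("Spearman:", spearman(mt_scores, clir_scores))
```

Computing the statistic separately for each IR collection and retrieval model would show whether the MT score tracks retrieval quality equally well in each setting, which is the kind of dissociation the abstract describes.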


Bibliographic details

Source
International Joint Conference on Natural Language Processing | 2019 | cxxxviii, pp. 3235-3881 | 6 pages
Authors
Constantine Lignos; Daniel Cohen; Yen-Chieh Lien; Pratik Mehta; W. Bruce Croft; Scott Miller
Original format: PDF
Chinese Library Classification: Programming, software engineering
