Journal: Machine Translation

The hare and the tortoise: speed and accuracy in translation retrieval


Abstract

This research looks at the effects of segment order and segmentation on translation retrieval performance for an experimental Japanese-English translation memory system. We implement a number of both bag-of-words and segment-order-sensitive string comparison methods, and test each over character-based and word-based indexing using n-grams of various orders. To evaluate accuracy, we propose an automatic method which identifies the target-language string(s) which would lead to the optimal translation for a given input, based on analysis of the held-out translation and the current contents of the translation memory. Our results indicate that character-based indexing is superior to word-based indexing, and also that bag-of-words methods are equivalent to segment-order-sensitive methods in terms of accuracy but vastly superior in terms of retrieval speed, suggesting that word segmentation and segment-order sensitivity are unnecessary luxuries for translation retrieval.
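The contrast the abstract draws between bag-of-words and segment-order-sensitive comparison can be illustrated with a minimal sketch over character n-grams. The abstract does not say which similarity measures were used, so the Dice coefficient and Levenshtein edit distance below are illustrative stand-ins, and the function names and the toy translation memory are hypothetical.

```python
from collections import Counter


def char_ngrams(text, n):
    """Return the overlapping character n-grams of text (no word segmentation needed)."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def dice_bag_of_ngrams(a, b, n=2):
    """Order-insensitive similarity: Dice coefficient over n-gram multisets."""
    bag_a, bag_b = Counter(char_ngrams(a, n)), Counter(char_ngrams(b, n))
    total = sum(bag_a.values()) + sum(bag_b.values())
    if total == 0:
        return 0.0
    overlap = sum((bag_a & bag_b).values())  # multiset intersection
    return 2.0 * overlap / total


def edit_distance(a, b):
    """Order-sensitive comparison: Levenshtein distance over characters."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def edit_similarity(a, b):
    """Normalize edit distance into a similarity score between 0 and 1."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest


def retrieve(query, memory, score):
    """Return the (source, target) pair whose source string scores highest."""
    return max(memory, key=lambda pair: score(query, pair[0]))


# Hypothetical toy translation memory of (Japanese source, English target) pairs.
memory = [
    ("猫が魚を食べた", "The cat ate the fish"),
    ("犬が水を飲んだ", "The dog drank water"),
]
query = "猫が肉を食べた"

best_bag = retrieve(query, memory, dice_bag_of_ngrams)
best_edit = retrieve(query, memory, edit_similarity)
```

The bag-of-n-grams score needs only multiset counts per string, which is one plausible reason such methods retrieve faster than the quadratic-time dynamic programming used for edit distance; this sketch does not attempt to reproduce the paper's timings or accuracy figures.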
