Conference: IEEE International Conference on Internet of Things and Intelligence System

Adapting Google Translate using Dictionary and Word Embedding for Arabic-Indonesian Cross-lingual Information Retrieval



Abstract

Translation plays an essential role in cross-lingual information retrieval. Dictionary-based translation is reliable, although its vocabulary is limited. Google Translate, in some cases, produces words that differ from those used in the target documents. This mismatch makes word translation less accurate for retrieving relevant documents. In this paper, we propose a new translation approach that adapts Google Translate using a dictionary and word embeddings for Arabic-Indonesian cross-lingual information retrieval. The dictionary is the primary translation resource, enhanced with Levenshtein distance and FastText to find the correct word translation. Google Translate completes the translation when a word does not exist in the dictionary. The proposed method achieves a BLEU score of 0.47, which is higher than the scores of the other resources compared. In this implementation, the proposed method improves the translated query so that it retrieves more relevant documents in cross-lingual information retrieval.
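
The dictionary-first strategy with a machine-translation fallback can be sketched roughly as follows. The abstract does not say how Levenshtein distance and FastText are combined, so this is only one plausible reading: Levenshtein distance is used to match a query word against near-identical dictionary entries, the FastText step is omitted entirely, and `stub_mt` is a hypothetical stand-in for an external service such as Google Translate. The names `TOY_DICT`, `translate_word`, `translate_query`, and the `max_distance` threshold are all assumptions made for illustration, not details taken from the paper.

```python
from typing import Callable, Dict


def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions, and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def translate_word(word: str,
                   dictionary: Dict[str, str],
                   fallback: Callable[[str], str],
                   max_distance: int = 1) -> str:
    """Dictionary first, then a near match, then the fallback translator."""
    if word in dictionary:
        return dictionary[word]
    # Assumed use of Levenshtein distance: tolerate small spelling variants.
    best = min(dictionary, key=lambda entry: levenshtein(word, entry),
               default=None)
    if best is not None and levenshtein(word, best) <= max_distance:
        return dictionary[best]
    # The word is not covered by the dictionary: defer to machine translation.
    return fallback(word)


def translate_query(query: str,
                    dictionary: Dict[str, str],
                    fallback: Callable[[str], str]) -> str:
    """Translate a whitespace-tokenized query word by word."""
    return " ".join(translate_word(w, dictionary, fallback)
                    for w in query.split())


# Toy Arabic-to-Indonesian dictionary (illustrative entries only).
TOY_DICT = {
    "كتاب": "buku",      # book
    "مدرسة": "sekolah",  # school
    "ماء": "air",        # water
}


def stub_mt(word: str) -> str:
    """Hypothetical placeholder for a call to an external MT service."""
    return f"<mt:{word}>"
```

Calling `translate_query("كتاب مدرسه", TOY_DICT, stub_mt)` resolves the first word by exact lookup and the second, a common spelling variant of "مدرسة", by edit distance. A word with no close dictionary entry is passed to `stub_mt` instead.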
