Conference: IEEE International Conference on Research, Innovation and Vision for the Future

English-Vietnamese Cross-Language Information Retrieval: An Experimental Study


Abstract

Translation units play an important role in Cross-Language Information Retrieval (CLIR) involving languages that lack explicit word boundaries, such as Chinese and Vietnamese. While the impact of these units has been well studied in English-Chinese CLIR, it has not been considered in English-Vietnamese CLIR. In this paper, we therefore examine how different translation units contribute to English-Vietnamese CLIR performance. Like translation units, different translation resources tend to have different effects on CLIR, so the use of these linguistic resources should also be investigated. Since a bilingual dictionary is the only resource available, we have constructed an English-Vietnamese parallel corpus using an automatic system based on web mining. We then experiment with three methods of exploiting these resources: using the dictionary only, using the parallel corpus only, and using the parallel corpus to disambiguate the translations given by the dictionary. Our experiments show that an automatically generated parallel corpus is feasible for both translation and disambiguation in English-Vietnamese CLIR.
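To make the difference between two of these strategies concrete, the following is a minimal sketch of dictionary-only query translation and of dictionary translation disambiguated with a parallel corpus. It is an illustration under stated assumptions, not the method used in the paper: the abstract does not describe the scoring function, so the co-occurrence count below is a stand-in, and the dictionary entries, sentence pairs, and function names are all hypothetical toy examples. The corpus-only method is left out because it would need a word-alignment step that the abstract does not specify.

```python
# Hypothetical toy bilingual dictionary: English term -> candidate Vietnamese translations.
DICTIONARY = {
    "bank": ["ngân hàng", "bờ sông"],
    "loan": ["khoản vay"],
}

# Hypothetical toy parallel corpus: (English sentence, Vietnamese sentence) pairs.
PARALLEL_CORPUS = [
    ("the bank approved the loan", "ngân hàng đã duyệt khoản vay"),
    ("the bank raised interest rates", "ngân hàng tăng lãi suất"),
    ("we walked along the river bank", "chúng tôi đi dọc bờ sông"),
]


def translate_dictionary_only(query):
    """Dictionary only: keep every listed translation of each known query term."""
    return {term: DICTIONARY[term] for term in query.split() if term in DICTIONARY}


def translate_with_disambiguation(query):
    """Dictionary plus corpus: keep one translation per term, chosen by corpus support.

    Assumed scoring (not from the paper): a candidate is supported by each
    sentence pair whose English side contains another query term and whose
    Vietnamese side contains the candidate string.
    """
    terms = query.split()
    chosen = {}
    for term in terms:
        candidates = DICTIONARY.get(term, [])
        if not candidates:
            continue  # terms missing from the dictionary are skipped
        context = set(terms) - {term}

        def support(candidate):
            return sum(
                1
                for english, vietnamese in PARALLEL_CORPUS
                if context & set(english.split()) and candidate in vietnamese
            )

        # max() returns the first candidate on ties, i.e. the dictionary order.
        chosen[term] = max(candidates, key=support)
    return chosen
```

With this toy data, the dictionary-only function keeps both senses of "bank" for the query "bank loan", while the disambiguating function uses the other query term as context to keep a single sense. Matching candidates as whole strings on the Vietnamese side sidesteps word segmentation, which is the translation-unit question the abstract raises.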
