Journal: ACM Transactions on Asian Language Information Processing

Chinese OOV Translation and Post-translation Query Expansion in Chinese-English Cross-lingual Information Retrieval



Abstract

Cross-lingual information retrieval allows users to query mixed-language collections or to probe for documents written in an unfamiliar language. A major difficulty for cross-lingual information retrieval is the detection and translation of out-of-vocabulary (OOV) terms; for OOV terms in Chinese, another difficulty is segmentation. At NTCIR-4, we explored methods for translation and disambiguation for OOV terms when using a Chinese query on an English collection. We have developed a new segmentation-free technique for automatic translation of Chinese OOV terms using the web. We have also investigated the effects of distance factor and window size when using a hidden Markov model to provide disambiguation. Our experiments show these methods significantly improve effectiveness; in conjunction with our post-translation query expansion technique, effectiveness approaches that of monolingual retrieval.
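The abstract names a hidden Markov model, a distance factor, and a window size for disambiguation but gives no formulas, so the following is only a rough illustration of how such pieces could fit together, not the authors' method. It assumes a simplified chain with uniform start and emission scores, an exponential distance decay, and unnormalized co-occurrence scores as transition weights. The function names `cooccurrence` and `disambiguate`, the parameters `window`, `decay`, and `smoothing`, and the toy data are all hypothetical.

```python
import math
from collections import defaultdict


def cooccurrence(sentences, window=4, decay=0.5):
    """Count ordered word pairs that occur within `window` tokens of each other.

    Assumption: each pair is weighted by exp(-decay * (distance - 1)), so
    adjacent words contribute 1.0 and more distant words contribute less.
    """
    counts = defaultdict(float)
    for tokens in sentences:
        for i, left in enumerate(tokens):
            for d in range(1, window + 1):
                if i + d >= len(tokens):
                    break
                counts[(left, tokens[i + d])] += math.exp(-decay * (d - 1))
    return counts


def disambiguate(candidates, counts, smoothing=1e-3):
    """Pick one translation per query term with a Viterbi-style search.

    `candidates` holds one list of candidate translations per query term.
    Transition scores are symmetric, smoothed co-occurrence weights
    (an assumption; a full HMM would also use start and emission terms).
    """
    # best[word] = (log score of the best path ending in word, that path)
    best = {word: (0.0, [word]) for word in candidates[0]}
    for options in candidates[1:]:
        step = {}
        for cur in options:
            step[cur] = max(
                (
                    score + math.log(
                        counts.get((prev, cur), 0.0)
                        + counts.get((cur, prev), 0.0)
                        + smoothing
                    ),
                    path + [cur],
                )
                for prev, (score, path) in best.items()
            )
        best = step
    return max(best.values())[1]


# Toy target-language corpus and candidate translations (invented data).
sentences = [
    ["bank", "account", "interest", "rate"],
    ["river", "bank", "shore"],
    ["interest", "rate", "bank"],
]
candidates = [["bank", "shore"], ["interest", "hobby"]]

counts = cooccurrence(sentences, window=4, decay=0.5)
print(disambiguate(candidates, counts))  # ['bank', 'interest']
```

In this sketch, shrinking `window` to 1 removes the "bank"/"interest" pair from the counts entirely, which illustrates why window size and distance weighting can change which translation is selected.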
