Venue: Workshop of the Cross-Language Evaluation Forum

Using Statistical Translation Models for Bilingual IR

Abstract

This report describes our tests on applying statistical translation models to the bilingual IR tasks of CLEF-2001. These translation models were trained on a set of parallel web pages automatically mined from the Web. Our previous studies have shown the utility of such corpora for cross-language information retrieval. The goal of the current tests is to see how we can improve the quality of the translation models and make the best use of them. Several questions are considered: Is it useful to consider the IDF factor in addition to the translation probabilities? Is it useful to further clean the training corpora before model training, or to clean the translation models themselves? How could we combine the translation models with bilingual dictionaries? Although our tests do not allow us to answer all these questions, they provide useful indications for several directions of further research.
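To make the first question concrete, the following is a minimal sketch of one way translation probabilities and IDF could be combined when selecting target-language query terms. It is an illustration under stated assumptions, not the method used in the report: the function names, the smoothed IDF formula, the multiplicative combination, and the toy probabilities and document frequencies are all hypothetical.

```python
import math


def idf(term, doc_freq, num_docs):
    """Smoothed inverse document frequency (the smoothing is an assumption)."""
    return math.log((num_docs + 1) / (doc_freq.get(term, 0) + 1))


def translate_query(source_terms, trans_probs, doc_freq, num_docs,
                    top_k=3, use_idf=True):
    """Rank candidate target terms for a source-language query.

    trans_probs maps a source word to {target word: p(target | source)}.
    With use_idf=True each probability is multiplied by the target word's
    IDF, so frequent, uninformative translations are pushed down the list.
    """
    scores = {}
    for source in source_terms:
        for target, prob in trans_probs.get(source, {}).items():
            weight = prob * idf(target, doc_freq, num_docs) if use_idf else prob
            scores[target] = scores.get(target, 0.0) + weight
    # Sort by descending score, then alphabetically for a stable order.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_k]


# Toy data, invented for illustration only.
TRANS_PROBS = {"maison": {"house": 0.6, "the": 0.3, "home": 0.1}}
DOC_FREQ = {"house": 10, "the": 1000, "home": 20}
NUM_DOCS = 1000

if __name__ == "__main__":
    print(translate_query(["maison"], TRANS_PROBS, DOC_FREQ, NUM_DOCS,
                          use_idf=False))
    print(translate_query(["maison"], TRANS_PROBS, DOC_FREQ, NUM_DOCS,
                          use_idf=True))
```

In this toy setup, ranking by probability alone places the function word "the" second, whereas the IDF-weighted ranking moves it below "home". Whether such weighting helps retrieval in practice is exactly the kind of question the report sets out to test.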
