International Conference on Computational Linguistics; COLING 2010

How to Get the Same News from Different Language News Papers



Abstract

This paper presents ongoing work on identifying similar documents across newspapers in different languages. Our aim is to identify, across languages, the documents similar to a given news item or event supplied as a query, and so to make cross-lingual search easier and more accurate. For example, given an event or news item in English, the system retrieves all related English news documents as well as related documents in other languages such as Hindi, Bengali, Tamil, Telugu, Malayalam, and Spanish. We use the Vector Space Model (VSM), a well-known method for similarity calculation; the novelty lies in how the terms for the VSM calculation are identified. The approach does not use a robust translation system to translate the documents. The system performs with good recall and precision.
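
As an illustration of the similarity step only, the following is a minimal sketch of Vector Space Model scoring with cosine similarity. It assumes each document has already been reduced to a list of terms in a shared vocabulary; how the paper identifies those terms across languages is not described in the abstract, so that step is not shown. The function names, the raw term-frequency weighting, and the sample documents are all hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(terms_a, terms_b):
    """Cosine similarity between two term lists, weighted by raw term frequency.

    Assumption: both lists use the same vocabulary (terms already mapped
    across languages by some earlier step that is not shown here).
    """
    vec_a, vec_b = Counter(terms_a), Counter(terms_b)
    # Only terms present in both documents contribute to the dot product.
    dot = sum(vec_a[t] * vec_b[t] for t in vec_a.keys() & vec_b.keys())
    norm_a = math.sqrt(sum(c * c for c in vec_a.values()))
    norm_b = math.sqrt(sum(c * c for c in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is treated as similar to nothing
    return dot / (norm_a * norm_b)


def rank_documents(query_terms, documents):
    """Return (doc_id, score) pairs sorted from most to least similar."""
    scored = [
        (doc_id, cosine_similarity(query_terms, terms))
        for doc_id, terms in documents.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical data: terms are shown after mapping to a common vocabulary.
query = ["election", "delhi", "result"]
documents = {
    "en_1": ["election", "result", "delhi", "vote"],
    "hi_1": ["election", "delhi"],
    "es_1": ["football", "madrid"],
}
for doc_id, score in rank_documents(query, documents):
    print(doc_id, round(score, 3))
```

A real system would more likely use TF-IDF or another weighting instead of raw counts; raw counts are used here only to keep the sketch short.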
