Journal: Sadhana

Ontology-based Tamil–English cross-lingual information retrieval system


Abstract

Cross-lingual information retrieval (CLIR) systems let users query for information in one language and retrieve relevant documents in another. In general, a CLIR system translates the query from the source language into the target language and retrieves target-language documents based on the keywords present in the translated query. However, ambiguity in the source and translated queries reduces the performance of the system. An ontology can be used to address this problem. Current ontology-based CLIR systems rely on manually constructed multilingual ontologies, which are expensive to build. Many methods exist to construct an ontology automatically for any domain in English, but not for other languages such as Tamil. To address these issues, we propose a methodology for a Tamil–English CLIR system that translates the Tamil query into English and retrieves pages in English. Our approach uses a word sense disambiguation module to resolve ambiguity in the Tamil query, and an automatically constructed English ontology to resolve ambiguity in the English query. To translate a Tamil query into English, we have developed a morphological analyser for Tamil, a Tamil–English bilingual dictionary and a named entity database. The translated query is reformulated using the ontology, and the reformulated queries are submitted to a search engine to retrieve English documents from the Internet. We have evaluated our methodology on the agriculture domain, and the results show that our approach outperforms other approaches in terms of precision.
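The translate-then-reformulate flow described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the dictionary, named entity and ontology entries are toy data, the function names are hypothetical, the input is assumed to be already romanised and reduced to root forms (so morphological analysis is skipped), and the disambiguation step is a trivial placeholder rather than the paper's word sense disambiguation module.

```python
from itertools import product

# Hypothetical toy resources; the paper's actual bilingual dictionary,
# named entity database and ontology are not reproduced here.
BILINGUAL_DICT = {
    "nel": ["paddy", "rice"],
    "noy": ["disease"],
}
NAMED_ENTITIES = {"thanjavur": "Thanjavur"}
ONTOLOGY = {
    "paddy": ["rice crop"],
    "disease": ["blast", "blight"],
}


def translate(tokens):
    """Map each romanised root token to its candidate English terms."""
    candidates = []
    for tok in tokens:
        if tok in NAMED_ENTITIES:
            candidates.append([NAMED_ENTITIES[tok]])
        elif tok in BILINGUAL_DICT:
            candidates.append(BILINGUAL_DICT[tok])
        else:
            candidates.append([tok])  # pass unknown tokens through unchanged
    return candidates


def disambiguate(candidates):
    """Placeholder for word sense disambiguation: keep the first candidate."""
    return [options[0] for options in candidates]


def reformulate(terms):
    """Build query variants by swapping each term for related ontology concepts."""
    options = [[term] + ONTOLOGY.get(term, []) for term in terms]
    return [" ".join(combo) for combo in product(*options)]


queries = reformulate(disambiguate(translate(["nel", "noy"])))
for query in queries:
    print(query)
```

Each printed line is one reformulated query that could be passed to a search engine. A real system would replace the placeholder disambiguation with a context-aware choice among the dictionary candidates.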
