InforadarML: A multi-lingual information discovery tool exploiting automatic document categorization.


Abstract

In this thesis we present the design of Inforadar ML, a multilingual extension of Inforadar, the first search engine supporting automatically generated visual query hierarchies. The central hypothesis of this work is that the retrieval effectiveness for multilingual documents can be improved by simultaneously providing the search engine with human-translated multilingual queries, each identified with its source language. Inforadar ML enhances Inforadar by adding support for multilingual queries and document collections. We have developed a test collection of multilingual web documents, queries, and human-generated relevance judgments that is freely available to the scientific community. We have conducted precision/recall experiments to assess the effectiveness of three document ranking algorithms. Our experiments suggest that automatic ranking of multilingual result sets, even using naive ranking algorithms, yields results comparable to independent manual sifting of separate results from equivalent queries in different languages. We believe that more efficient multilingual ranking algorithms can provide more valuable responses to specific multilingual information needs.
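As a rough illustration of the precision/recall measurements mentioned above, the following minimal sketch computes both values for the top k results of one merged multilingual ranking. The function name `precision_recall`, the document identifiers, and the relevance judgments are hypothetical and are not taken from the thesis or its test collection.

```python
def precision_recall(ranked_ids, relevant_ids, k):
    """Return (precision, recall) over the top-k entries of a ranked list.

    ranked_ids   -- document ids in ranked order (hypothetical example data)
    relevant_ids -- set of ids judged relevant by a human assessor
    k            -- cutoff rank
    """
    top_k = ranked_ids[:k]
    hits = sum(1 for doc_id in top_k if doc_id in relevant_ids)
    precision = hits / len(top_k) if top_k else 0.0
    recall = hits / len(relevant_ids) if relevant_ids else 0.0
    return precision, recall


# A merged result list mixing English ("en-") and Spanish ("es-") documents.
ranked = ["en-3", "es-1", "en-7", "es-4", "en-2"]
relevant = {"en-3", "es-4", "es-9"}

p, r = precision_recall(ranked, relevant, k=4)
print(f"precision@4 = {p:.2f}, recall@4 = {r:.2f}")
```

Repeating this calculation at several cutoffs and averaging over queries gives one common way to compare ranking algorithms; the exact protocol used in the thesis is not stated in the abstract.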
