InforadarML: A multi-lingual information discovery tool exploiting automatic document categorization.


Abstract

In this thesis we present the design of Inforadar ML, a multilingual extension of Inforadar, the first search engine supporting automatically generated visual query hierarchies. The central hypothesis of this work is that the retrieval effectiveness of multilingual documents can be improved by simultaneously providing the search engine with human-translated multilingual queries, each identified with its source language. Inforadar ML enhances Inforadar by adding support for multilingual queries and document collections. We have developed a test collection of multilingual web documents, queries, and human-generated relevance judgments, which is freely available to the scientific community. We have conducted precision/recall experiments to assess the effectiveness of three document ranking algorithms. Our experiments suggest that automatic ranking of multilingual result sets, even using naive ranking algorithms, yields results comparable to independent manual sifting of separate results from equivalent queries in different languages. We believe that more efficient multilingual ranking algorithms can provide a more valuable response to specific multilingual information needs.
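The abstract does not describe how the precision/recall experiments were implemented. As a rough illustration only, the sketch below computes precision and recall over the top k results of a single merged, ranked list, given a set of human relevance judgments. The function name `precision_recall_at_k` and the document identifiers (language code plus number) are assumptions made for this example and do not come from the thesis.

```python
def precision_recall_at_k(ranked_ids, relevant_ids, k):
    """Return (precision, recall) for the first k entries of a ranked list.

    ranked_ids   -- document ids in ranked order (hypothetical ids)
    relevant_ids -- set of ids judged relevant by human assessors
    k            -- cutoff rank; values <= 0 yield (0.0, 0.0)
    """
    if k <= 0:
        return 0.0, 0.0
    top_k = ranked_ids[:k]
    # Count how many of the top-k results were judged relevant.
    hits = sum(1 for doc_id in top_k if doc_id in relevant_ids)
    precision = hits / k
    recall = hits / len(relevant_ids) if relevant_ids else 0.0
    return precision, recall


# Hypothetical merged result list mixing English and Spanish documents.
ranked = ["en-3", "es-1", "en-7", "es-4", "en-2"]
relevant = {"es-1", "en-2", "es-9"}

for k in (1, 3, 5):
    p, r = precision_recall_at_k(ranked, relevant, k)
    print(f"k={k}: precision={p:.2f} recall={r:.2f}")
```

Applying the same measurement to the list produced by each ranking algorithm, and to manually sifted per-language result lists, is one way such a comparison could be set up; the thesis itself may have used a different procedure.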


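Bibliographic record

Author: Valiente-Fernandez, Jairo E.
Author affiliation: University of Puerto Rico, Mayaguez (Puerto Rico)
Degree-granting institution: University of Puerto Rico, Mayaguez (Puerto Rico)
Subject: Computer Science; Information Science
Degree: M.S.
Year: 2003
Pages: 55
Original format: PDF
Language of text: English
Chinese Library Classification: Automation technology, computer technology; Information and knowledge dissemination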

Similar documents

1. Multi-lingual date field extraction for automatic document retrieval by machine [J]. Mandal Ranju, Roy Partha Pratim, Pal Umapada. Information Sciences: An International Journal, 2015.
2. Automatic topics discovery from hyperlinked documents [J]. Wu KJ., Chen MC., Sun YL. Information Processing & Management, 2004, no. 2.
3. Discovery and representation of the preferences of automatically detected groups: Exploiting the link between group modeling and clustering [J]. Ludovico Boratto, Salvatore Carta, Gianni Fenu. Future Generation Computer Systems, 2016.
4. INFORADAR-CL: A CROSS-LINGUAL INFORMATION DISCOVERY TOOL EXPLOITING AUTOMATIC DOCUMENT CATEGORIZATION [C]. Jairo E. Valiente-Fernandez, Bienvenido J. Velez-Rivera. Information and Knowledge Sharing, 2002.
5. The implementation of dynamic document organization using the integration of text clustering and text categorization [D]. Jo, Taeho. 2006.
6. SITC/iSBTc Cancer Immunotherapy Biomarkers Resource Document: Online resources and useful tools - a compass in the land of biomarker discovery [O]. Davide Bedognetti, James M Balwit, Ena Wang. 2011.
7. Multi-lingual date field extraction for automatic document retrieval by machine [O]. Mandal Ranju, Roy Partha Pratim, Pal Umapada. 2015.
