Database selection in distributed information retrieval: A study of multi-collection information retrieval.

Abstract

The proliferation of online information resources increases the importance of effective and efficient information retrieval in a multi-collection environment. Multi-collection searching includes distributed searching as a special case but is more broadly defined here to incorporate searching partitioned content independently of its physical storage. It is cast in three parts: collection selection (also referred to as database selection), which decides where a query should be sent; query processing, which executes the query at each selected collection; and results merging, which combines the results from individual collections into a single coherent list for the searcher. We focus our attention on collection selection.

We compare a number of different collection selection approaches and examine the effect of collection selection on document retrieval performance. We consider multi-collection retrieval in six different test environments utilizing three document testbeds. Considering collection selection in isolation, we find that effective collection selection can be achieved using limited information about each collection. We then turn our attention from selection alone to data item retrieval in a multi-collection environment, considering retrieval performance in the same six test environments. First, we find that good collection selection has the potential to result in better retrieval effectiveness than can be achieved in an equivalent single collection. Second, we find that good performance can be achieved when only a few collections are selected and that performance generally increases as more collections are selected. Finally, we find that when collection selection is employed, it may not be necessary to maintain collection-wide information (CWI), e.g., global idf; local information can be used to achieve equivalent performance. This means that multi-collection systems can be engineered with more autonomy and less cooperation.

This work demonstrates that improvements in collection selection can lead to broader improvements in document retrieval performance.
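
The three-part pipeline described in the abstract (collection selection, query processing, results merging) can be illustrated with a small sketch. The abstract does not name the selection algorithms the thesis compares, so everything below is an assumption made for illustration: the toy collections, the function names, the document-frequency selection score, and the tf-idf style local scoring are all hypothetical and are not the author's method. The sketch only shows how selection can rely on limited per-collection information and how each collection can score documents with its own local idf rather than a global one.

```python
import math
from collections import Counter

# Hypothetical toy collections (not from the thesis): name -> list of documents.
COLLECTIONS = {
    "news": ["stock market falls", "market rally continues", "election results announced"],
    "sports": ["football match results", "tennis final tonight"],
    "science": ["quantum computing breakthrough", "market for quantum sensors"],
}


def tokenize(text):
    return text.lower().split()


def summarize(docs):
    """Limited per-collection information: document count and term document frequencies."""
    df = Counter()
    for doc in docs:
        df.update(set(tokenize(doc)))
    return {"n_docs": len(docs), "df": df}


def select_collections(query, summaries, k):
    """Collection selection: rank by the summed fraction of documents containing each query term."""
    terms = tokenize(query)
    scores = {
        name: sum(s["df"][t] / s["n_docs"] for t in terms)
        for name, s in summaries.items()
    }
    # Ties are broken by collection name so the ranking is deterministic.
    ranked = sorted(scores, key=lambda name: (-scores[name], name))
    return ranked[:k]


def search_local(query, docs, top_n):
    """Query processing at one collection, using only that collection's local idf."""
    summary = summarize(docs)
    n = summary["n_docs"]
    results = []
    for doc in docs:
        tf = Counter(tokenize(doc))
        score = sum(
            tf[t] * math.log(1 + n / summary["df"][t])
            for t in tokenize(query)
            if tf[t]
        )
        if score > 0:
            results.append((score, doc))
    results.sort(key=lambda r: (-r[0], r[1]))
    return results[:top_n]


def multi_collection_search(query, collections, k=2, top_n=3):
    """Select k collections, query each one, and merge results by raw local score."""
    summaries = {name: summarize(docs) for name, docs in collections.items()}
    selected = select_collections(query, summaries, k)
    merged = []
    for name in selected:
        for score, doc in search_local(query, collections[name], top_n):
            merged.append((score, name, doc))
    merged.sort(key=lambda r: (-r[0], r[1], r[2]))
    return selected, merged[:top_n]
```

Calling `multi_collection_search("market results", COLLECTIONS, k=2)` sends the query only to the two highest-ranked collections. Merging on raw local scores is the simplest possible choice and is used here only to keep the sketch short; scores computed from different local statistics are not strictly comparable.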

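Bibliographic record

Author: Powell, Allison Lane
Author affiliation: University of Virginia
Degree-granting institution: University of Virginia
Discipline: Computer Science
Degree: Ph.D.
Year: 2001
Pages: 250
Original format: PDF
Language: English
Chinese Library Classification: Automation technology, computer technology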
