Integrating text search and relational databases: Functionality and performance.

Abstract

Applications increasingly involve a mix of free-text documents and traditional relational tables [46]. Commercial relational database management systems (RDBMSs) store both types of data and support access through keyword search, traditional relational operators in SQL, or mixed queries that combine both. However, application developers lack tools that address the functionality and performance concerns that arise when integrating keyword search in an RDBMS, even though such tools are available for traditional scalar data. With regard to functionality, this thesis proposes TextViews as a fully declarative way to specify virtual collections of virtual documents for use with keyword search. For performance, this thesis proposes TEXTURE, a benchmark for comparing RDBMSs given a workload of mixed queries.

Current RDBMSs store a document as a single attribute value and a single collection in a table. TextViews are an adaptation of relational views for defining documents that are composed of multiple documents, possibly stored in multiple tables. Such documents are grouped into a collection and ranked using keyword search. Keyword search can be evaluated either by materializing the TextView and then searching, or by using inverted indexes built on the base table. Inverted indexes do not take advantage of the scalar attributes used in selection and grouping operations that are specified in TextView definitions. Consequently, we propose several alternative indexes for which we demonstrate an order-of-magnitude improvement in response time for keyword search, with a modest increase in storage compared to inverted indexes.

The TEXTURE benchmark [28] compares RDBMSs by measuring the response time needed to evaluate a workload of mixed queries. A micro-benchmark design is used to allow fine-grained control for specifying the query workload and data set. To support database scale-up experiments, TextGen, a novel synthetic text generator, was developed and evaluated. TextGen is unique in that it can accurately scale up an input "seed" text collection while preserving important data characteristics. The TEXTURE benchmark was used to evaluate three commercial RDBMSs, demonstrating large differences between them for a variety of workloads.
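The "materialize, then search" evaluation strategy mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the thesis's implementation or the TextView syntax: the `MESSAGES` table, its column names, and the helper functions are all hypothetical, and ranking is omitted. It only shows how a scalar selection and a grouping step determine which virtual documents exist before any keyword index is built.

```python
from collections import defaultdict

# Hypothetical base table: each row stores one document plus scalar attributes.
MESSAGES = [
    {"id": 1, "thread": "t1", "year": 2005, "body": "index tuning for keyword search"},
    {"id": 2, "thread": "t1", "year": 2005, "body": "inverted index storage cost"},
    {"id": 3, "thread": "t2", "year": 2006, "body": "benchmark workload for mixed queries"},
    {"id": 4, "thread": "t2", "year": 2004, "body": "keyword search benchmark"},
]

def materialize_view(rows, min_year):
    """Select rows by a scalar predicate, group them by thread, and
    concatenate each group's text into one virtual document."""
    docs = defaultdict(list)
    for row in rows:
        if row["year"] >= min_year:
            docs[row["thread"]].append(row["body"])
    return {key: " ".join(parts) for key, parts in docs.items()}

def build_inverted_index(docs):
    """Map each term to the set of virtual-document keys that contain it."""
    index = defaultdict(set)
    for key, text in docs.items():
        for term in text.lower().split():
            index[term].add(key)
    return index

def keyword_search(index, query):
    """Return keys of documents containing every query term (no ranking)."""
    postings = [index.get(term, set()) for term in query.lower().split()]
    return sorted(set.intersection(*postings)) if postings else []

view = materialize_view(MESSAGES, min_year=2005)
index = build_inverted_index(view)
print(keyword_search(index, "keyword search"))  # ['t1']
```

Row 4 contains both query terms but is excluded by the year predicate, so thread `t2` does not match. An inverted index built directly on the base table would not know about that predicate or the grouping, which is the gap the abstract says the alternative indexes are meant to address.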

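Bibliographic details

Author: Ercegovac, Vuk
Author affiliation: The University of Wisconsin - Madison
Degree-granting institution: The University of Wisconsin - Madison
Subject: Computer Science
Degree: Ph.D.
Year: 2006
Pages: 135 p.
Total pages: 135
Original format: PDF
Language of text: eng
Chinese Library Classification: Automation technology, computer technology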

Similar literature

1. Uncovering Text-Music Connections with a Relational Database: Towards an Objective Measurement of Melodic Pitch Diversity in Relation to Literary Themes in Bach's Church Cantata Recitatives [J]. Melvin Unger. Computers and the Humanities. 2004, No. 3
2. ORFer – retrieval of protein sequences and open reading frames from GenBank and storage into relational databases or text files [J]. Konrad Büssow, Steve Hoffmann, Volker Sievert. BMC Bioinformatics. 2002, No. 1
3. Atlas: a nested relational database system for text applications [J]. Sacks-Davis R., Kent A. IEEE Transactions on Knowledge and Data Engineering. 1995, No. 3
4. Using Functional Dependencies in Conversion of Relational Databases to Graph Databases [C]. Youmna A. Megid, Neamat El-Tazi, Aly Fahmy. International conference on database and expert systems applications; International workshop on big data management in cloud systems; International workshop on biological knowledge discovery; International workshop on technologies for information retrieval. 2018
5. Information-seeking behavior on the World Wide Web: Effects of cognitive style, online database search experience and task types on search performance [D]. Kim, Kyung-Sun. 1998
6. A computerized representation of a medical school curriculum: integration of relational and text management software in database design [O]. W. D. Mattern, J. A. Wagner, J. S. Brown. 1991
7. Database-integrated genome screening (DIGS): exploring genomes heuristically using sequence similarity search tools and a relational database [O]. Henan Zhu, Tristan Dennis, Joseph Hughes. 2018
