Published in: ACM SIGMOD International Conference on Management of Data

Pay-As-You-Go - An Adaptive Approach to Provide Full Context-Aware Text Search over Document Content



Abstract

An RDBMS provides the best performance for querying structured data that starts out with a well-defined schema. However, such a 'schema first, data later' approach does not work for unstructured data or data with little structure. Therefore, an RDBMS typically stores such data without any schema in LOB columns (for example, Character Large Object (CLOB) or Binary Large Object (BLOB) columns) and provides Information-Retrieval (IR) style, keyword-based search capability over these LOB columns. More recently, XML has been introduced as a native datatype (XMLType) in RDBMSs via the SQL/XML standard. Semi-structured data with or without a schema can be stored in such XMLType columns, and XQuery provides query capability over them. In particular, the XQuery Full Text specification provides the capability of searching for keywords within document context. Such full context-aware text search is more powerful than pure keyword search, since the user can now specify the fine-grained context in which the keywords should occur. However, XML with XQuery full-text search requires that users first convert their text data into XML and store it in an XMLType column. Such massive physical data migration, with its possible loss of document fidelity and its potential impact on existing production environments, is often expensive enough that users are reluctant to adopt the XML/XQuery approach. In this paper, we propose a pay-as-you-go architecture that provides an XML text view over LOB columns, so that users can take advantage of context-aware full-text search adaptively. This adaptive architecture includes a novel XML text index that can be created over the LOB column where the content is stored. The XML text index supports an XML text view over the LOB data, on top of which XQuery full-text search becomes feasible. Such an adaptive index/view approach is minimally intrusive to existing data, as it requires no physical data migration. We describe the design and challenges of building such an adaptive XML text index. Furthermore, we advocate that the pay-as-you-go approach provides the integration bridge between the structured relational world and the text-oriented document world and fulfils the primary motivation of XML in the database.
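The contrast the abstract draws between pure keyword search and context-aware search over a view can be illustrated with a small sketch. This is not the paper's XML text index or an XQuery Full Text implementation; it is a toy in Python, assuming hypothetical documents whose lines follow a `Label: value` layout, with the XML view built on the fly at query time so that the stored text is never modified. The names `DOCS`, `keyword_search`, `xml_view`, and `context_search` are all made up for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical plain-text documents, standing in for rows of a CLOB column.
DOCS = {
    1: "Title: Index design\nBody: The migration cost was low.",
    2: "Title: Migration notes\nBody: The index was rebuilt overnight.",
}


def _word_pattern(word):
    """Case-insensitive whole-word matcher for a single keyword."""
    return re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)


def keyword_search(docs, word):
    """Pure keyword search: the word may occur anywhere in the stored text."""
    pattern = _word_pattern(word)
    return sorted(doc_id for doc_id, text in docs.items() if pattern.search(text))


def xml_view(text):
    """Derive an XML view of one document; the stored text stays untouched.

    Assumption: each line looks like 'Label: value' and becomes <label>value</label>.
    """
    root = ET.Element("doc")
    for line in text.splitlines():
        label, _, value = line.partition(":")
        child = ET.SubElement(root, label.strip().lower())
        child.text = value.strip()
    return root


def context_search(docs, element, word):
    """Context-aware search: the word must occur inside the named element."""
    pattern = _word_pattern(word)
    hits = []
    for doc_id, text in docs.items():
        view = xml_view(text)
        if any(pattern.search(node.text or "") for node in view.iter(element)):
            hits.append(doc_id)
    return sorted(hits)


if __name__ == "__main__":
    print(keyword_search(DOCS, "migration"))           # matches both documents
    print(context_search(DOCS, "title", "migration"))  # only the title of document 2
```

The keyword query cannot tell a title mention from a body mention, while the context query can, and neither query required rewriting `DOCS` into a new storage format. A real system would presumably precompute this structure in an index rather than re-parse each document per query, which is the role the abstract assigns to the XML text index.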
