An Efficient Index Structure for String Databases

Abstract

We consider the problem of substring searching in large databases. Typical applications of this problem are genetic data, web data, and event sequences. Since the size of such databases grows exponentially, it becomes impractical to use in-memory algorithms for these problems. In this paper, we propose to map the substrings of the data into an integer space with the help of wavelet coefficients. We then index these coefficients using MBRs (Minimum Bounding Rectangles). We define a distance function which is a lower bound on the actual edit distance between strings. We experiment with both nearest neighbor queries and range queries. The results show that our technique prunes a significant portion of the database (typically 50-95%), thus reducing both the disk I/O cost and the CPU cost significantly.
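
The filter-and-refine idea in the abstract (a cheap distance that never exceeds the edit distance, used to discard candidates before the expensive comparison) can be illustrated with a small sketch. The abstract does not give the exact transform, so this example makes several assumptions: it uses plain letter-count vectors as a stand-in for the wavelet coefficients, it compares whole strings rather than substrings, it scans the database linearly instead of using an MBR index, and the function names and the "ACGT" alphabet are hypothetical. The bound it relies on is that one insertion, deletion, or substitution changes any single letter count by at most one.

```python
from collections import Counter


def frequency_vector(s, alphabet="ACGT"):
    """Map a string to a vector of letter counts (assumed stand-in for the
    paper's wavelet coefficients; letters outside the alphabet are ignored)."""
    counts = Counter(s)
    return tuple(counts.get(ch, 0) for ch in alphabet)


def frequency_distance(u, v):
    """Lower bound on edit distance between the strings behind u and v.
    One edit raises at most one count by 1 and lowers at most one count by 1,
    so at least max(surplus, deficit) edits are required."""
    surplus = sum(a - b for a, b in zip(u, v) if a > b)
    deficit = sum(b - a for a, b in zip(u, v) if b > a)
    return max(surplus, deficit)


def edit_distance(s, t):
    """Standard dynamic-programming edit distance (unit costs)."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, 1):
        cur = [i]
        for j, ct in enumerate(t, 1):
            cur.append(min(prev[j] + 1,                # delete cs
                           cur[j - 1] + 1,             # insert ct
                           prev[j - 1] + (cs != ct)))  # substitute or match
        prev = cur
    return prev[-1]


def range_query(database, query, radius, alphabet="ACGT"):
    """Return strings within `radius` edits of `query`, plus the number of
    candidates discarded by the lower bound without computing edit distance."""
    qv = frequency_vector(query, alphabet)
    hits, pruned = [], 0
    for s in database:
        if frequency_distance(frequency_vector(s, alphabet), qv) > radius:
            pruned += 1  # lower bound already exceeds the radius: safe to skip
            continue
        if edit_distance(s, query) <= radius:
            hits.append(s)
    return hits, pruned


if __name__ == "__main__":
    db = ["ACGT", "AAAA", "ACGA", "TTTT"]
    print(range_query(db, "ACGT", 1))
```

Because the filter only skips strings whose lower bound exceeds the radius, it cannot drop a true answer; it only saves edit-distance computations. An index over the vectors would apply the same test to bounding rectangles so that whole groups of strings can be skipped at once.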

Bibliographic details

Source
Twenty-Seventh International Conference on Very Large Data Bases, Sep 11-14, 2001, Roma, Italy | 2001 | pp. 351-360 | 10 pages
Conference location: Roma (IT)
Authors
Tamer Kahveci; Ambuj K. Singh
Author affiliation

Department of Computer Science, University of California Santa Barbara, CA 93106;

Original format: PDF
Language of text: English
Chinese Library Classification: Automation technology, computer technology

Similar documents

1. A space efficient solution to the frequent string mining problem for many databases [J]. Adrian Kügel, Enno Ohlebusch. Data Mining and Knowledge Discovery. 2008, No. 1
2. A space efficient solution to the frequent string mining problem for many databases [J]. Kugel A, Ohlebusch E. Data Mining and Knowledge Discovery. 2008, No. 1
3. A hybrid index structure for querying large string databases [J]. Qiang Xue, Sakti Pramanik, Gang Qiang. International Journal of Electronic Business. 2005, No. 3/4
4. An Efficient Index Structure for String Databases [C]. Ambuj K. Singh, Tamer Kahveci. International Conference on Very Large Data Bases. 2001
5. Space efficient string search algorithms and data structures [D]. Deoghare, Pratik. 2015
6. Preparation of Activated Carbon Supported Bead String Structure Nano Zero Valent Iron in a Polyethylene Glycol-Aqueous Solution and Its Efficient Treatment of Cr(VI) Wastewater [O]. Chunlei Jiao, Xiao Tan, Aijun Lin. 2020
7. INSTRUCT: Space-Efficient Structure for Indexing and Complete Query Management of String Databases [O]. Dutta, Sourav; Bhattacharya, Arnab. 2012
8. Efficient bit string implementation of a database cross-field association system (with an application to protein sequence patterns) [R]. Guigo, R; Vazquez, I; Smith, T F. 1992
