A Practical Method for Approximate Subsequence Search in DNA Databases

Abstract

In this paper, we propose an accurate and efficient method for approximate subsequence search in large DNA databases. The proposed method uses a binary trie as its primary structure and stores all the window subsequences extracted from a DNA sequence. For approximate subsequence search, it traverses the binary trie in a breadth-first fashion and retrieves all the matched subsequences from the traversed path within the trie by a dynamic programming technique. However, the proposed method stores only window subsequences of a pre-determined length, and thus incurs a large post-processing time for long query sequences. To overcome this problem, we divide a query sequence into shorter pieces, search for those pieces, and then merge their results.
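The trie-plus-dynamic-programming idea described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the 2-bit base encoding, the names `Node`, `build_trie`, and `search`, the use of plain edit distance, and the choice to record window start offsets at every base boundary are all assumptions made for the example. The query-splitting and result-merging step for long queries is not shown.

```python
from collections import deque

# Assumed 2-bit encoding of the four bases (not taken from the paper).
BITS = {"A": (0, 0), "C": (0, 1), "G": (1, 0), "T": (1, 1)}
BASES = {bits: base for base, bits in BITS.items()}


class Node:
    __slots__ = ("children", "starts")

    def __init__(self):
        self.children = [None, None]  # one child per bit value
        self.starts = []              # start offsets of windows reaching this node


def build_trie(seq, window):
    """Insert every window of at most `window` bases, one bit per trie level."""
    root = Node()
    for start in range(len(seq)):
        node = root
        for base in seq[start:start + window]:
            for bit in BITS[base]:
                if node.children[bit] is None:
                    node.children[bit] = Node()
                node = node.children[bit]
            node.starts.append(start)  # recorded only at base boundaries
    return root


def search(root, query, k):
    """Return sorted (start, length) pairs within edit distance k of `query`."""
    m = len(query)
    hits = set()
    # Queue entries: (node, pending high bit or None, path length in bases, DP row).
    # row[j] is the edit distance between the path so far and query[:j].
    queue = deque([(root, None, 0, list(range(m + 1)))])
    while queue:  # breadth-first traversal of the trie
        node, high, depth, row = queue.popleft()
        for bit in (0, 1):
            child = node.children[bit]
            if child is None:
                continue
            if high is None:
                # First bit of a base: wait for the second bit before updating.
                queue.append((child, bit, depth, row))
                continue
            base = BASES[(high, bit)]
            new = [row[0] + 1]
            for j in range(1, m + 1):
                cost = 0 if query[j - 1] == base else 1
                new.append(min(row[j] + 1, new[j - 1] + 1, row[j - 1] + cost))
            if new[m] <= k:
                hits.update((s, depth + 1) for s in child.starts)
            if min(new) <= k:  # otherwise no extension of this path can match
                queue.append((child, None, depth + 1, new))
    return sorted(hits)


if __name__ == "__main__":
    trie = build_trie("ACGTACGT", window=4)
    print(search(trie, "ACG", k=0))  # prints [(0, 3), (4, 3)]
```

Because only windows of a fixed length are indexed, a query longer than the window cannot be matched along a single path, which is the limitation the abstract addresses by splitting the query into shorter pieces.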

Bibliographic record

Source
Advances in Knowledge Discovery and Data Mining, Lecture Notes in Artificial Intelligence, vol. 4426, 2007, pp. 921-931 (11 pages)
Conference location: Nanjing (CN)
Authors
Jung-Im Won; Sang-Kyoon Hong; Jee-Hee Yoon; Sanghyun Park; Sang-Wook Kim;
Author affiliations

College of Information and Communications, Hanyang University, Korea;

Division of Information Engineering and Telecommunications, Hallym University, Korea;

Department of Computer Science, Yonsei University, Korea;

Original format: PDF
Language: English
Chinese Library Classification: TP311.13
Keywords
DNA database; approximate subsequence search; suffix tree;

Similar literature

1. Similarity-based subsequence search in image sequence databases [J]. Sanghyun Park, Wesley W. Chu. International Journal of Image and Graphics. 2003, No. 1
2. A practical method for browsing a relational database using a standard search engine [J]. Brian Harrington, Robert Brazile, Kathleen Swigger. Integrated Computer-Aided Engineering. 2009, No. 3
3. Accelerating approximate subsequence search on large protein sequence databases [C]. Jiong Yang, Wei Wang, Yi Xia. 2002
4. Fast Locality Sensitive Hashing Algorithm for Approximate Nearest Neighbor Search: A Practical Data Mining Approach [D]. Buaba, Ruben. 2012
5. Comparing the effectiveness of using generic and specific search terms in electronic databases to identify health outcomes for a systematic review: a prospective comparative study of literature search methods [O]. Matt Egan, Alice MacLean, Helen Sweeting. 2012
6. Accelerating Approximate Subsequence Search on Large Protein Sequence Databases [O]. Jiong Yang, Wei Wang, Yi Xia. 2008
