Mining Frequent Contiguous Sequence Patterns in Biological Sequences


Abstract

Biological sequences such as DNA and amino acid sequences typically contain a large number of items, and their contiguous sequences often consist of hundreds of frequent items. In biological sequence analysis (BSA), searching for frequent contiguous sequences is one of the most important operations. Many studies have addressed efficient sequential pattern mining. More recently, the MacosVSpan algorithm was proposed; it builds on the idea of the PrefixSpan algorithm and significantly reduces PrefixSpan's recursive process. However, MacosVSpan is inefficient at mining frequent contiguous sequences from long biological data sequences. In this paper, we propose an efficient method for mining maximal frequent contiguous sequences in large biological data sequences by constructing a spanning tree with a fixed length. To evaluate the proposed method, we perform experiments in various environments. The experiments show that the proposed method is much more efficient than MacosVSpan in terms of retrieval performance.
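To make the terminology in the abstract concrete, the following is a minimal Python sketch of what "frequent contiguous sequence" and "maximal frequent contiguous sequence" mean. It is not the fixed-length spanning-tree method proposed in the paper, nor MacosVSpan; the abstract does not give enough detail to reproduce either. It is a simple level-wise scan with Apriori-style pruning, and the function names, the `min_support` and `max_len` parameters, and the definition of support as "number of input sequences containing the pattern" are all assumptions made for illustration.

```python
from collections import defaultdict


def frequent_contiguous(sequences, min_support, max_len):
    """Return {pattern: support} for contiguous patterns up to max_len.

    Assumption: support is the number of input sequences that contain
    the pattern at least once as a contiguous substring.
    """
    frequent = {}
    previous = None  # frequent patterns of length k - 1
    for k in range(1, max_len + 1):
        counts = defaultdict(int)
        for seq in sequences:
            seen = set()  # count each pattern once per sequence
            for i in range(len(seq) - k + 1):
                window = seq[i:i + k]
                # Pruning: a contiguous pattern can only be frequent if
                # both of its (k - 1)-length ends are frequent.
                if previous is not None and (
                    window[:-1] not in previous or window[1:] not in previous
                ):
                    continue
                seen.add(window)
            for window in seen:
                counts[window] += 1
        current = {p: c for p, c in counts.items() if c >= min_support}
        if not current:
            break
        frequent.update(current)
        previous = current
    return frequent


def maximal_patterns(frequent):
    """Keep patterns that are not a substring of a longer frequent pattern."""
    return {
        p: c
        for p, c in frequent.items()
        if not any(p != q and p in q for q in frequent)
    }


if __name__ == "__main__":
    dna = ["ACGTAC", "TACGTT", "GACGTA"]  # toy data, not from the paper
    found = frequent_contiguous(dna, min_support=3, max_len=4)
    print(maximal_patterns(found))
```

Note that the result is maximal only among patterns of length at most `max_len`. The quadratic substring check in `maximal_patterns` is chosen for readability and would not scale to the long sequences the abstract targets.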

Bibliographic record

Source: IEEE International Symposium on Bioinformatics and Bioengineering, 2007, 6 pages
Authors: Tae Ho Kang; Jae Soo Yoo; Hak Yong Kim
Original format: PDF
CLC classification: Biomathematical methods
Keywords: Biological Sequence Analysis; Sequential pattern mining; Bioinformatics

