Protein Sequence Pattern Mining with Constraints

Abstract

Considering the characteristics of biological sequence databases, which typically have a small alphabet, very long sequences, and a relatively small size (several hundred sequences), we propose a new sequence mining algorithm (gIL). gIL was developed for linear sequence pattern mining and combines some of the most efficient techniques used in sequence and itemset mining. The algorithm is highly adaptable, allowing a smooth and direct introduction of various types of features into the mining process, namely the extraction of rigid and arbitrary gap patterns. Both breadth-first and depth-first traversals are possible. The experimental evaluation, on synthetic and real-life protein databases, has shown that our algorithm outperforms state-of-the-art algorithms. The use of constraints has also proved to be a very useful tool for specifying the patterns of interest to the user.
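To make the distinction between rigid and arbitrary gap patterns concrete, the following is a minimal Python sketch. It is not the gIL algorithm; it only checks how many sequences of a toy database contain a given gapped pattern. The function names (`pattern_to_regex`, `support`), the `(min_gap, max_gap)` gap notation, the example sequences, and the `min_support` threshold are all assumptions made for illustration and do not come from the paper.

```python
import re

def pattern_to_regex(symbols, gaps):
    """Build a regex for a gapped pattern (hypothetical notation).

    symbols: list of residues, e.g. ["A", "C", "D"]
    gaps: one (min_gap, max_gap) tuple per adjacent symbol pair.
          (n, n) is a rigid gap of exactly n wildcard positions;
          (lo, hi) with lo < hi is an arbitrary (variable-length) gap.
    """
    if len(gaps) != len(symbols) - 1:
        raise ValueError("need exactly one gap per adjacent symbol pair")
    parts = [re.escape(symbols[0])]
    for (lo, hi), sym in zip(gaps, symbols[1:]):
        if hi > 0:
            # '.' stands for any single residue
            parts.append(".{%d,%d}" % (lo, hi))
        parts.append(re.escape(sym))
    return re.compile("".join(parts))

def support(regex, sequences):
    """Count the sequences that contain at least one occurrence."""
    return sum(1 for s in sequences if regex.search(s))

# Toy database (made up for this example)
sequences = ["MACDEF", "ACXDKL", "AXCXXD", "QQQQ"]

# Rigid gaps: A, C, D must be contiguous
rigid = pattern_to_regex(["A", "C", "D"], [(0, 0), (0, 0)])

# Arbitrary gaps: up to 1 residue between A and C, up to 2 between C and D
arbitrary = pattern_to_regex(["A", "C", "D"], [(0, 1), (0, 2)])

# A simple user constraint: keep only patterns found in enough sequences
min_support = 2
for name, rx in [("rigid", rigid), ("arbitrary", arbitrary)]:
    count = support(rx, sequences)
    print(name, count, "frequent" if count >= min_support else "infrequent")
```

With this toy data the rigid pattern occurs in one sequence and the arbitrary gap pattern in three, so only the latter passes the assumed minimum support of 2. A real miner would enumerate candidate patterns rather than test a fixed one.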

Bibliographic record

Source
European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD 2005); 3-7 October 2005; Porto (PT) | 2005 | pp. 96-107 | 12 pages
Conference location: Porto (PT)
Authors
Pedro Gabriel Ferreira; Paulo J. Azevedo
Author affiliation

University of Minho, Department of Informatics, Campus of Gualtar, 4710-057 Braga, Portugal

Original format: PDF
Language: English
Chinese Library Classification: TP311.13

Similar literature

1. PMBC: Pattern mining from biological sequences with wildcard constraints [J]. Wu X., Zhu X., He Y. Computers in Biology and Medicine. 2013, Issue 5
2. Mining DNA Sequence Patterns with Constraints Using Hybridization of Firefly and Group Search Optimization [J]. Kuruva Lakshmanna, Neelu Khare. Journal of Intelligent Systems. 2018, Issue 3
3. Prediction of protein disordered regions in a protein sequence based on gap-constraint subsequence patterns [C]. Meijing Li, Xiuming Yu, Taewook Kim. The 4th International Conference on Awareness Science and Technology. 2012
4. Mining High Utility Sequential Patterns from Uncertain Web Access Sequences using the PL-WAP [D]. Vangala, Sravya. 2017
5. Genome-wide patterns of nucleotide substitution reveal stringent functional constraints on the protein sequences of thermophiles [O]. Robert Friedman, John W Drake, Austin L Hughes. 2004
6. Protein Sequence Pattern Mining with Constraints [O]. Pedro Gabriel Ferreira, Paulo J. Azevedo. 2014
7. Detecting and Mining Similarities, Differences and Target Patterns in Sequences of Images Using the PFF, LGG and SPNG Approaches [R]. Bourbakis, D. 2004
