Discovery of Non-induced Patterns from Sequences


Abstract

Discovering patterns from sequence data has a significant impact in genomics, proteomics and business. A commonly encountered problem is that the discovered patterns often contain many redundancies: spuriously significant patterns induced by their strongly statistically significant subpatterns. The concept of statistically induced patterns is proposed to capture these redundancies. An algorithm is then developed to efficiently discover non-induced significant patterns from a large sequence dataset. For performance evaluation, two experiments were conducted to demonstrate a) the seriousness of the problem, using synthetic data, and b) that the top non-induced significant patterns discovered from Saccharomyces cerevisiae (yeast) do correspond to transcription factor binding sites found by biologists. The experiments confirm the effectiveness of our method in generating a relatively small set of patterns that reveal interesting, unknown information inherent in the sequences.
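
The redundancy described in the abstract can be illustrated with a small sketch. It is not the paper's algorithm or its statistical test, neither of which is given on this page. It assumes a simple standardized residual, (observed - expected) / sqrt(expected), and a hypothetical planted motif `ACGTAC`. The function names, the motif, and the dataset sizes are all illustrative assumptions. A one-symbol extension of the motif looks highly significant against a model of independent symbols, yet is unremarkable once the expectation is conditioned on the observed count of its subpattern.

```python
import math
import random
from collections import Counter

def count_substrings(seqs, max_len):
    """Count every substring of length 1..max_len (overlaps included)."""
    counts = Counter()
    for s in seqs:
        for k in range(1, max_len + 1):
            for i in range(len(s) - k + 1):
                counts[s[i:i + k]] += 1
    return counts

def symbol_freq(counts, alphabet):
    """Relative frequency of each single symbol."""
    total = sum(counts[a] for a in alphabet)
    return {a: counts[a] / total for a in alphabet}

def z_independent(pattern, counts, freq, seqs):
    """Standardized residual against a model of independent symbols."""
    windows = sum(max(len(s) - len(pattern) + 1, 0) for s in seqs)
    expected = windows * math.prod(freq[ch] for ch in pattern)
    return (counts[pattern] - expected) / math.sqrt(expected)

def z_given_subpattern(pattern, counts, freq):
    """Standardized residual when the expectation is conditioned on the
    observed count of the longest proper prefix or suffix (assumed
    criterion, chosen only for illustration)."""
    via_prefix = counts[pattern[:-1]] * freq[pattern[-1]]
    via_suffix = counts[pattern[1:]] * freq[pattern[0]]
    expected = max(via_prefix, via_suffix)
    if expected == 0:
        return 0.0  # subpattern never occurs, so neither does the pattern
    return (counts[pattern] - expected) / math.sqrt(expected)

# Synthetic data: random DNA with one copy of a hypothetical motif per sequence.
random.seed(0)
MOTIF = "ACGTAC"
seqs = []
for _ in range(200):
    s = [random.choice("ACGT") for _ in range(100)]
    pos = random.randrange(0, 100 - len(MOTIF) - 1)  # leave room for one more symbol
    s[pos:pos + len(MOTIF)] = MOTIF
    seqs.append("".join(s))

counts = count_substrings(seqs, len(MOTIF) + 1)
freq = symbol_freq(counts, "ACGT")
superpattern = MOTIF + "T"  # motif followed by one background symbol

print("independent model :", round(z_independent(superpattern, counts, freq, seqs), 2))
print("given subpattern  :", round(z_given_subpattern(superpattern, counts, freq), 2))
```

Under these assumptions, the extension inherits its apparent significance from the planted motif, which is the kind of induced pattern the abstract proposes to filter out. The keywords suggest the authors use a suffix tree for counting; the brute-force counter above is a stand-in chosen for brevity.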


Bibliographic details

Source
International Conference on Pattern Recognition in Bioinformatics | 2010 | 12 pages
Authors
Andrew K. C. Wong; Dennis Zhuang; Gary C. L. Li; En-Shiun Annie Lee
Original format: PDF
Chinese Library Classification: Q-53
Keywords
Sequence Pattern Discovery; Statistically Induced Patterns; Suffix Tree


Similar literature


1. Research on problem description and methods for knowledge discovery in sequence patterns [J]. Yin Guofu, Jiang Hua, Long Hongneng. Journal of Shanghai University (English Edition). 2004, Issue 0z1
2. Research on problem description and methods for knowledge discovery in sequence patterns [J]. Yin Guofu, Jiang Hua, Long Hongneng. Journal of Shanghai University: English Edition. 2004, Issue A01
3. Analysis of the relationships among Longest Common Subsequences, Shortest Common Supersequences and patterns and its application on pattern discovery in biological sequences [J]. Ning K., Ng H.K., Leong H.W. International Journal of Data Mining and Bioinformatics. 2011, Issue 6
4. Discovery of Delta Closed Patterns and Noninduced Patterns from Sequences [J]. Wong Andrew K.C. IEEE Transactions on Knowledge and Data Engineering. 2012, Issue 8
5. Computational discovery of feature patterns in nucleosomal DNA sequences [J]. Zheng Yiyu, Li Xiaoman, Hu Haiyan. Genomics. 2014, Issue 2
6. Discovery of Non-induced Patterns from Sequences [C]. Andrew K.C. Wong, Dennis Zhuang, Gary C.L. Li. Pattern Recognition in Bioinformatics. 2010
7. Pattern Discovery in DNA Sequences [D]. Yan, Rui. 2012
8. MAGIIC-PRO: detecting functional signatures by efficient discovery of long patterns in protein sequences [O]. Chen-Ming Hsu, Chien-Yu Chen, Baw-Jhiune Liu. 2006
9. Discovery of Non-induced Patterns from Sequences [O]. Andrew K. C. Wong, Dennis Zhuang, Gary C. L. Li. 2010
