Conference: International Conference on Machine Learning and Applications

Variable-Length Protein Sequence Motif Extraction Using Hierarchically-Clustered Hidden Markov Models



Abstract

Primary sequence motif extraction from protein amino acid sequences is a field of growing importance in bioinformatics because of its relevance to both sequence and structural analysis. Many approaches to motif extraction suffer from two limitations: they rely on finding an existing, known protein homologue in order to perform motif extraction or structural analysis, and they assume a fixed motif length. This work proposes the Hierarchically-Clustered Hidden Markov Model (HC-HMM) approach, which represents the behavior and structure of proteins as Hidden Markov Model (HMM) chains and hierarchically clusters those chains by minimizing the distance between the structure and behavior of pairs of chains. It is well known that HMMs can be used for clustering; methods for clustering the Hidden Markov Models themselves, however, are rarely studied. In this paper, we propose a hierarchical-clustering-based algorithm for HMMs that discovers protein sequence motifs that transcend family boundaries, with no assumption about motif length. The paper examines the effectiveness of this approach for motif extraction on 2,593 proteins that share no more than 25% sequence identity. The approach generates many motifs; three example motifs generated by HC-HMM are analyzed and visualized with their tertiary structure.
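The abstract does not specify the distance measure or the linkage rule, so the following is only a minimal sketch of the general idea of hierarchically clustering HMMs by a pairwise distance. It assumes each HMM is given as a pair of row-stochastic matrices (transitions, emissions) of equal shape, uses the Euclidean distance between parameters, and merges clusters with average linkage until a hypothetical `threshold` is exceeded. The names `hmm_distance`, `cluster_distance`, and `hierarchical_cluster` are illustrative and are not taken from the paper.

```python
from itertools import combinations

# Assumption: an HMM is a pair (transitions, emissions), each a list of rows.
# All models are assumed to have the same number of states and symbols.

def hmm_distance(a, b):
    """Euclidean distance between the parameters of two HMMs (an assumed metric)."""
    total = 0.0
    for mat_a, mat_b in zip(a, b):
        for row_a, row_b in zip(mat_a, mat_b):
            for x, y in zip(row_a, row_b):
                total += (x - y) ** 2
    return total ** 0.5

def cluster_distance(c1, c2, models):
    """Average-linkage distance between two clusters of model indices."""
    pairs = [(i, j) for i in c1 for j in c2]
    return sum(hmm_distance(models[i], models[j]) for i, j in pairs) / len(pairs)

def hierarchical_cluster(models, threshold):
    """Repeatedly merge the two closest clusters until none is within threshold."""
    clusters = [[i] for i in range(len(models))]
    while len(clusters) > 1:
        d, x, y = min(
            (cluster_distance(clusters[x], clusters[y], models), x, y)
            for x, y in combinations(range(len(clusters)), 2)
        )
        if d > threshold:
            break
        # x < y always holds, so deleting index y leaves index x valid.
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters

# Toy data: three 2-state, 2-symbol HMMs; the first two are nearly identical.
models = [
    ([[0.90, 0.10], [0.20, 0.80]], [[0.70, 0.30], [0.10, 0.90]]),
    ([[0.85, 0.15], [0.25, 0.75]], [[0.65, 0.35], [0.15, 0.85]]),
    ([[0.10, 0.90], [0.80, 0.20]], [[0.20, 0.80], [0.90, 0.10]]),
]
print(hierarchical_cluster(models, threshold=0.5))  # prints [[0, 1], [2]]
```

Because clusters are merged by distance alone, nothing in this sketch fixes how many residues a model covers, which is loosely in the spirit of the abstract's claim that no motif length is assumed. A practical version would likely need a distance that accounts for state relabeling and for models of different sizes.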
