Conference: International Conference on Machine Learning and Applications

Variable-Length Protein Sequence Motif Extraction Using Hierarchically-Clustered Hidden Markov Models

Abstract

Primary sequence motif extraction from protein amino acid sequences is a field of growing importance in bioinformatics because of its relevance to both sequence and structural analysis. Many motif extraction approaches share two limitations: they rely on finding an existing, known protein homologue in order to perform motif extraction or structural analysis, and they assume a motif length in advance. This work proposes the Hierarchically Clustered Hidden Markov Model (HC-HMM) approach, which represents the behavior and structure of proteins as Hidden Markov Model chains and hierarchically clusters the chains by minimizing the distance between the structure and behavior of any two given chains. It is well known that HMMs can be used for clustering; however, methods for clustering Hidden Markov Models themselves are rarely studied. In this paper, we propose a hierarchical-clustering-based algorithm for HMMs that discovers protein sequence motifs transcending family boundaries, with no assumption about motif length. We examine the effectiveness of this approach for motif extraction on 2,593 proteins that share no more than 25% sequence identity. Many interesting motifs are generated; three example motifs produced by the HC-HMM approach are analyzed and visualized with their tertiary structure.
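
The idea of hierarchically clustering HMMs by the distance between them can be illustrated with a minimal sketch. This is not the paper's algorithm: the abstract does not specify the distance measure or the linkage rule, so the Frobenius-norm distance, the average linkage, the stopping threshold `max_distance`, and the function names `hmm_distance` and `cluster_hmms` are all assumptions made for illustration. The sketch also assumes every HMM has the same number of states and symbols, so it does not address variable-length motifs.

```python
import numpy as np


def hmm_distance(hmm_a, hmm_b):
    """Illustrative distance between two HMMs (an assumption, not the paper's).

    Each HMM is a (transition_matrix, emission_matrix) pair of equal shapes.
    The distance is the sum of the Frobenius norms of the differences.
    """
    (trans_a, emit_a), (trans_b, emit_b) = hmm_a, hmm_b
    return float(np.linalg.norm(trans_a - trans_b)
                 + np.linalg.norm(emit_a - emit_b))


def cluster_hmms(hmms, max_distance):
    """Average-linkage agglomerative clustering over a list of HMMs.

    Returns clusters as lists of indices into `hmms`. Merging stops when
    the closest pair of clusters is farther apart than `max_distance`.
    """
    n = len(hmms)
    # Precompute pairwise distances between the individual models.
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            dist[i, j] = dist[j, i] = hmm_distance(hmms[i], hmms[j])

    def linkage(c1, c2):
        # Average distance over all cross-cluster pairs of models.
        return float(np.mean([dist[i, j] for i in c1 for j in c2]))

    clusters = [[i] for i in range(n)]
    while len(clusters) > 1:
        # Find the closest pair of clusters.
        d, a, b = min(
            ((linkage(clusters[a], clusters[b]), a, b)
             for a in range(len(clusters))
             for b in range(a + 1, len(clusters))),
            key=lambda t: t[0],
        )
        if d > max_distance:
            break
        # Merge cluster b into cluster a.
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


# Toy data: models 0 and 1 are nearly identical, model 2 is very different.
toy_hmms = [
    (np.array([[0.9, 0.1], [0.2, 0.8]]), np.array([[0.7, 0.3], [0.4, 0.6]])),
    (np.array([[0.88, 0.12], [0.2, 0.8]]), np.array([[0.7, 0.3], [0.4, 0.6]])),
    (np.array([[0.1, 0.9], [0.8, 0.2]]), np.array([[0.2, 0.8], [0.9, 0.1]])),
]
toy_clusters = cluster_hmms(toy_hmms, max_distance=0.5)
```

With the toy models above, the two similar HMMs end up in one cluster and the dissimilar one stays on its own. A parameter-space distance is only one possible choice; a likelihood-based or divergence-based measure between models could be substituted in `hmm_distance` without changing the clustering loop.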