Enhancing Protein Domain Detection Using Domain Co-occurrence and Domain Exclusion

Abstract

Among the relevant annotations that can be attributed to a protein, domains occupy a key position. Protein domains are sequential and structural motifs that are found independently in different proteins and in different combinations. One of the most widely used domain schemes is the Pfam database, a collection of protein domains and families. Each family in Pfam is represented by a multiple sequence alignment and a Hidden Markov Model (HMM). When analyzing a new protein sequence, each Pfam HMM is used to compute a score measuring the similarity between the sequence and the domain. If the score is above a given threshold provided by Pfam, the presence of the domain can be asserted in the protein. However, when applied to proteins of organisms with high evolutionary distance from classical model organisms, this strategy may miss several domains. We recently proposed a method, the Co-Occurrence Domain Detection approach (CODD), that improves the sensitivity of Pfam domain detection by exploiting the tendency of domains to appear preferentially with a few other favorite domains in a protein. Here, we propose to integrate domain exclusion information to prune false positive domains that are in conflict with other domains of the protein. Applied to P. falciparum and L. major proteins, we show that this strategy substantially reduces the proportion of false positives among the new domains predicted by CODD, while preserving as much as possible the sensitivity of the approach.
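The idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how co-occurring or exclusive domain pairs are derived or tested, so the sketch assumes they are given as precomputed sets. The function `detect_domains`, the `relaxed_floor` parameter, the domain names, and all scores are hypothetical and chosen only for illustration.

```python
from typing import Dict, FrozenSet, Set, Tuple

Pair = FrozenSet[str]


def pair(a: str, b: str) -> Pair:
    """Unordered domain pair, so (A, B) and (B, A) compare equal."""
    return frozenset((a, b))


def detect_domains(
    scores: Dict[str, float],
    thresholds: Dict[str, float],
    relaxed_floor: float,
    cooccurring: Set[Pair],
    exclusive: Set[Pair],
) -> Tuple[Set[str], Set[str]]:
    """Return (certified, rescued) domains for one protein.

    Assumptions (not from the source): pair sets are precomputed, and a
    single relaxed score floor defines the pool of candidate domains.
    """
    # Step 1: standard detection, score at or above the per-domain threshold.
    certified = {d for d, s in scores.items() if s >= thresholds[d]}

    # Step 2: candidates fall below their threshold but above the relaxed floor.
    candidates = {
        d for d, s in scores.items()
        if d not in certified and s >= relaxed_floor
    }

    rescued = set()
    for cand in candidates:
        # Co-occurrence: the candidate is supported by a certified domain.
        supported = any(pair(cand, c) in cooccurring for c in certified)
        # Exclusion: the candidate conflicts with a certified domain.
        conflicting = any(pair(cand, c) in exclusive for c in certified)
        if supported and not conflicting:
            rescued.add(cand)
    return certified, rescued


# Hypothetical scores for one protein against four domain models.
scores = {"DomA": 30.0, "DomB": 12.0, "DomC": 11.0, "DomD": 2.0}
thresholds = {"DomA": 20.0, "DomB": 20.0, "DomC": 20.0, "DomD": 20.0}
cooccurring = {pair("DomA", "DomB"), pair("DomA", "DomC")}
exclusive = {pair("DomA", "DomC")}

certified, rescued = detect_domains(
    scores, thresholds, relaxed_floor=10.0,
    cooccurring=cooccurring, exclusive=exclusive,
)
print(sorted(certified), sorted(rescued))
```

In this toy input, DomB is rescued because it co-occurs with the certified DomA, DomC is pruned because it is also listed as exclusive with DomA, and DomD is never considered because its score is below the relaxed floor.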

Bibliographic record

Source
Twenty-Third International Workshop on Database and Expert Systems Applications, 2012, pp. 223-228 (6 pages)
Conference location: Vienna (AT)
Authors
Ghouila Amel; Gascuel Olivier; Yahia Sadok Ben; Brehelin Laurent
Original format: PDF
Language: English (eng)
Chinese Library Classification: TP311.13