Proteome Science

Fully automated protein complex prediction based on topological similarity and community structure

Abstract

To understand the function of protein complexes and their association with biological processes, many studies have analyzed protein-protein interaction (PPI) networks. However, advances in high-throughput technology have produced a very large amount of data for analysis. Moreover, the high level of noise, the sparseness, and the skewed degree distribution of PPI networks limit the performance of many clustering algorithms and hinder further analysis of the interactions.

To address these problems, we present a novel random-walk-based algorithm that converts the incomplete, binary PPI network into a protein-protein topological similarity matrix (PP-TS matrix). We believe that if two proteins share high-order topological similarities, they are likely to interact with each other. Using the PP-TS matrix, we constructed weighted networks to further study the interactions among proteins. Specifically, we applied a fully automated community structure finding algorithm (Auto-HQcut) to the weighted network to cluster proteins into complexes. We then analyzed the predicted complexes for significance in biological processes. To help visualize and analyze them, we also developed an interface that displays the resulting complexes together with the characteristics associated with each complex.

Applying our approach to a yeast protein-protein interaction network, we found that the predicted interaction pairs with high topological similarity have more significant biological relevance than the original interaction pairs. When we compared our PPI network reconstruction algorithm with other existing algorithms using gene ontology and gene co-expression, our algorithm produced the highest similarity scores. Our predicted protein complexes also showed higher accuracy than the other protein complex predictions.
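The abstract does not give the details of the random-walk step, so the following is only a minimal sketch of one common way to turn a binary interaction network into a dense similarity matrix. It assumes a random walk with restart from every protein, followed by Pearson correlation between the resulting visiting profiles. The function name `rwr_similarity`, the `restart` parameter, and the choice of correlation are illustrative assumptions, not the authors' published procedure.

```python
import numpy as np

def rwr_similarity(adj, restart=0.5):
    """Hypothetical sketch: topological similarity from random walks with restart.

    adj     : symmetric 0/1 adjacency matrix of the PPI network
    restart : probability of jumping back to the start node at each step (assumed value)
    """
    adj = np.asarray(adj, dtype=float)
    n = adj.shape[0]

    # Column-normalise so that P[i, j] is the probability of stepping from j to i.
    deg = adj.sum(axis=0)
    deg[deg == 0] = 1.0  # isolated nodes: avoid division by zero
    P = adj / deg

    # Steady state of the walk for all start nodes at once:
    # R = restart * (I - (1 - restart) * P)^-1, column j is the profile of node j.
    R = restart * np.linalg.solve(np.eye(n) - (1.0 - restart) * P, np.eye(n))

    # Two proteins are scored as similar when their visiting profiles correlate.
    return np.corrcoef(R.T)

# Toy network: two triangles (0-1-2 and 3-4-5) joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = np.zeros((6, 6))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0
S = rwr_similarity(A)
```

In this toy network, proteins inside the same triangle receive a higher score than proteins in different triangles. A matrix of this kind could then be thresholded into a weighted network before community detection.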
