...
Journal: Proteome Science

Accuracy improvement in protein complex prediction from protein interaction networks by refining cluster overlaps



Abstract

Background: Recent computational techniques have facilitated the analysis of genome-wide protein-protein interaction data for several model organisms. Various graph-clustering algorithms have been applied to genome-scale protein interaction networks to predict the entire set of potential protein complexes. In particular, density-based clustering algorithms that can generate overlapping clusters, i.e. clusters sharing a set of nodes, are well suited to protein complex detection because each protein can be a member of multiple complexes. However, their accuracy is still limited by the complex overlap patterns of their output clusters.

Results: We present a systematic approach to refining the overlapping clusters identified from protein interaction networks. We have designed novel metrics to assess cluster overlaps: overlap coverage and overlapping consistency. We then propose an overlap refinement algorithm that takes as input the clusters produced by existing density-based graph-clustering methods and generates a set of refined clusters by parameterizing the metrics. To evaluate protein complex prediction accuracy, we used the f-measure, comparing each refined cluster to known protein complexes. Experimental results on the yeast protein-protein interaction data sets from BioGRID and DIP demonstrate that the accuracy of protein complex prediction increased significantly after refining cluster overlaps.

Conclusions: This study validates the effectiveness of the proposed cluster overlap refinement approach for protein complex detection. Analyzing the overlaps of clusters from protein interaction networks is a crucial task for understanding the functional roles of proteins and the topological characteristics of functional systems.
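The abstract does not spell out how the f-measure is computed, so the following is only a minimal sketch under a common assumption: precision is the fraction of a predicted cluster's proteins that belong to a known complex, recall is the fraction of that complex's proteins recovered by the cluster, and the f-measure is their harmonic mean. The function names `f_measure` and `best_f_measure`, and the choice to score each cluster against its best-matching complex, are illustrative and not taken from the paper.

```python
def f_measure(cluster, known_complex):
    """Harmonic mean of precision and recall between two protein sets.

    Assumed definitions (not confirmed by the abstract):
      precision = |cluster ∩ complex| / |cluster|
      recall    = |cluster ∩ complex| / |complex|
    """
    cluster, known_complex = set(cluster), set(known_complex)
    shared = len(cluster & known_complex)
    if shared == 0:
        # No common proteins (or an empty set): score is zero.
        return 0.0
    precision = shared / len(cluster)
    recall = shared / len(known_complex)
    return 2 * precision * recall / (precision + recall)


def best_f_measure(cluster, known_complexes):
    """Score a cluster against its best-matching known complex (assumed policy)."""
    return max((f_measure(cluster, c) for c in known_complexes), default=0.0)


# Hypothetical protein identifiers, for illustration only.
predicted = {"P1", "P2", "P3"}
reference = [{"P2", "P3", "P4"}, {"P9"}]
score = best_f_measure(predicted, reference)
```

Here `predicted` shares two of its three proteins with the first reference complex, which itself has three members, so precision and recall are both 2/3 under the assumed definitions.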
