Conference: Asia-Pacific Bioinformatics Conference

Protein functional properties prediction in sparsely-label PPI networks through regularized non-negative matrix factorization

Abstract

Background: Predicting the functional properties of proteins in protein-protein interaction (PPI) networks is a challenging problem with important implications for computational biology. Collective classification (CC), which uses both attribute features and relational information to jointly classify related proteins in PPI networks, has been shown to be a powerful computational method for this problem setting. Enabling CC usually increases accuracy when a fully labeled PPI network with a large amount of labeled data is available. However, such labels can be difficult to obtain in many real-world PPI networks, which usually contain only a limited number of labeled proteins and a large number of unlabeled ones. In this case, most of the unlabeled proteins may not be connected to labeled ones, so supervision knowledge cannot be obtained effectively from local network connections. As a consequence, learning a CC model in sparsely labeled PPI networks can lead to poor performance.
Results: We investigate a latent graph approach that builds an integrated latent graph by exploiting various latent linkages and judiciously combining them so as to link proteins with similar functions and separate proteins with different functions. We develop a regularized non-negative matrix factorization (RNMF) algorithm for CC that predicts protein functional properties by using the data sources available in this problem setting, including attribute features, the latent graph, and unlabeled data. In RNMF, a label matrix factorization term and a network regularization term are incorporated into the non-negative matrix factorization (NMF) objective function to seek a factorization that respects both the network structure and the label information for classification.
Conclusion: Experimental results on KDD Cup tasks, which involve predicting the localization and functions of the proteins encoded by yeast genes, demonstrate the effectiveness of the proposed RNMF method for predicting protein properties. In the comparison, the new method performs better than the other CC algorithms considered, especially when labeled proteins are scarce.
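
Illustrative sketch (not the authors' implementation): the abstract does not state the RNMF objective, so the code below assumes one plausible form, ||X - VH||^2 + alpha * ||M(Y - VB)||^2 + beta * tr(V^T L V), where X holds non-negative attribute features (proteins by attributes), Y holds one-hot labels, M masks the labeled proteins, and L = D - W is the Laplacian of the latent graph W. The function names, the parameters `alpha` and `beta`, and the multiplicative updates (in the style of graph-regularized NMF) are all assumptions made for illustration.

```python
import numpy as np


def objective(X, W, Y, m, V, H, B, alpha, beta):
    """Assumed objective: feature fit + label fit on labeled rows + graph smoothness."""
    lap = np.diag(W.sum(axis=1)) - W                # graph Laplacian L = D - W
    fit = np.linalg.norm(X - V @ H) ** 2            # NMF reconstruction term
    lab = np.linalg.norm(m * (Y - V @ B)) ** 2      # label term, labeled rows only
    smooth = np.trace(V.T @ lap @ V)                # network regularization term
    return fit + alpha * lab + beta * smooth


def rnmf_sketch(X, W, Y, labeled, k=4, alpha=1.0, beta=0.1, iters=200, seed=0):
    """Multiplicative updates for the assumed objective (hypothetical helper).

    X: (n, d) non-negative features; W: (n, n) symmetric non-negative graph;
    Y: (n, c) one-hot labels (rows of unlabeled proteins are ignored);
    labeled: (n,) boolean mask of labeled proteins.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    c = Y.shape[1]
    V = rng.random((n, k)) + 1e-3                   # latent protein factors
    H = rng.random((k, d)) + 1e-3                   # feature basis
    B = rng.random((k, c)) + 1e-3                   # maps latent factors to labels
    m = labeled.astype(float)[:, None]              # column mask for labeled rows
    deg = W.sum(axis=1)[:, None]                    # node degrees (diagonal of D)
    eps = 1e-10                                     # avoids division by zero
    history = []
    for _ in range(iters):
        H *= (V.T @ X) / (V.T @ V @ H + eps)
        B *= (V.T @ (m * Y)) / (V.T @ (m * V) @ B + eps)
        num = X @ H.T + alpha * (m * Y) @ B.T + beta * (W @ V)
        den = V @ (H @ H.T) + alpha * (m * V) @ (B @ B.T) + beta * deg * V + eps
        V *= num / den
        history.append(objective(X, W, Y, m, V, H, B, alpha, beta))
    return V, H, B, history


if __name__ == "__main__":
    # Tiny synthetic example: 10 proteins, 6 attributes, 2 classes, 3 labeled.
    rng = np.random.default_rng(42)
    X = rng.random((10, 6))
    upper = np.triu(rng.random((10, 10)) < 0.3, 1)
    W = (upper + upper.T).astype(float)
    Y = np.zeros((10, 2))
    Y[np.arange(10), np.arange(10) % 2] = 1.0
    labeled = np.arange(10) < 3
    V, H, B, history = rnmf_sketch(X, W, Y, labeled, k=3)
    print("predicted classes:", (V @ B).argmax(axis=1))
```

In this sketch, the predicted class of an unlabeled protein is the largest entry in its row of `V @ B`. The graph term pulls the latent factors of linked proteins together, which is how label information could reach proteins that have no labeled neighbors.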