Conference proceedings: Research in Computational Molecular Biology

Predicting Protein-Protein Interactions from Multimodal Biological Data Sources via Nonnegative Matrix Tri-Factorization


Abstract

High-throughput experimental methods for discovering protein interactions have a high false positive rate, so computational methods are necessary and crucial for completing the interactome expeditiously. However, when building classification models to identify putative protein interactions, positive samples can be drawn directly from truly interacting protein pairs, whereas negative samples are usually very hard to select: non-interacting protein pairs are merely those that currently lack experimental or computational evidence of a physical interaction or a functional association, and they could still interact in reality. Instead of relying on heuristics, as many existing works do, in this paper we tackle this difficulty in a principled way by formulating protein interaction prediction from a new mathematical perspective, sparse matrix completion, and we propose a novel Nonnegative Matrix Tri-Factorization (NMTF) based matrix completion approach that predicts new protein interactions from existing protein interaction networks. Because matrix completion requires only positive samples and uses no negative samples, the challenge faced by existing classification-based methods for protein interaction prediction is circumvented. Using manifold regularization, we further extend our method to integrate different biological data sources, such as protein sequences, gene expression, and protein structure information. Extensive experimental results on the Saccharomyces cerevisiae genome show that our new methods outperform related state-of-the-art protein interaction prediction methods.
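The abstract does not give the update rules, so the following is only a minimal sketch of the general idea of scoring unobserved protein pairs with a nonnegative tri-factorization X ≈ F S Gᵀ. It uses the standard unconstrained multiplicative updates for the squared Frobenius loss, treats unobserved entries as zeros, and omits the manifold regularization term; it is not the authors' algorithm. The function name `nmtf_scores`, the rank, the iteration count, and the toy adjacency matrix are all assumptions made for illustration.

```python
import numpy as np

def nmtf_scores(X, rank=2, n_iter=200, eps=1e-9, seed=0):
    """Illustrative NMTF: fit X ~ F @ S @ G.T with nonnegative factors.

    Hypothetical helper, not the paper's method: plain multiplicative
    updates for ||X - F S G^T||_F^2, no manifold regularization.
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    # Strictly positive random initialization keeps the updates well defined.
    F = rng.random((n, rank)) + 0.1
    S = rng.random((rank, rank)) + 0.1
    G = rng.random((m, rank)) + 0.1
    # errors[0] is the reconstruction error before any update.
    errors = [np.linalg.norm(X - F @ S @ G.T)]
    for _ in range(n_iter):
        # Each factor is rescaled by (negative gradient part) / (positive part).
        F *= (X @ G @ S.T) / (F @ S @ G.T @ G @ S.T + eps)
        G *= (X.T @ F @ S) / (G @ S.T @ F.T @ F @ S + eps)
        S *= (F.T @ X @ G) / (F.T @ F @ S @ G.T @ G + eps)
        errors.append(np.linalg.norm(X - F @ S @ G.T))
    R = F @ S @ G.T
    # An interaction network is undirected, so symmetrize the scores.
    return (R + R.T) / 2, errors

# Toy network (made-up data): two groups of three proteins each.
X = np.zeros((6, 6))
X[:3, :3] = 1.0
X[3:, 3:] = 1.0
# Hide one within-group pair to mimic an interaction not yet observed.
X[0, 2] = X[2, 0] = 0.0

scores, errors = nmtf_scores(X)
print("score for hidden pair (0, 2):", scores[0, 2])
print("score for cross-group pair (0, 3):", scores[0, 3])
```

In this setup, only the observed interactions (the ones) act as positive evidence; no pair is ever labeled as a confirmed non-interaction. Candidate interactions would then be ranked by their entries in `scores`, which is the sense in which a completion-style method avoids choosing negative samples.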
