IEEE International Conference on Bioinformatics and Biomedicine

Protein-protein interaction network inference from multiple kernels with optimization based on random walk by linear programming


Abstract

Reconstruction of protein-protein interaction (PPI) networks is a central task in systems biology. Inference from multiple heterogeneous data sources offers a promising computational approach to de novo PPI prediction, because it leverages complementary information together with the partial network structure. However, learning weights for heterogeneous data sources quickly remains a challenge. In this work, we developed a method to infer de novo PPIs by combining multiple data sources represented in kernel format and obtaining optimal weights based on a random walk over the existing partial network. Our proposed method uses the Baker algorithm and the training data to construct a transition matrix that constrains how a random walk traverses the partial network. Multiple heterogeneous features for the proteins in the network, including gene expression and Pfam domain profiles, are then combined into a weighted kernel. This kernel provides a new "adjacency matrix" for the whole network, but it is required to agree with the transition matrix on the training subnetwork. The requirement is met by adjusting the weights to minimize the element-wise difference between the transition matrix and the weighted kernel, and the minimization problem is solved by linear programming. The weighted kernel is then transformed into a regularized Laplacian (RL) kernel to infer missing or new edges in the PPI network. Results on synthetic data and on real data from yeast show that the accuracy of PPI prediction, measured by AUC, increases by up to 19% compared with a control method that does not use optimal weights. Moreover, the weights learned by our method, Weight Optimization by Linear Programming (WOLP), are very consistent with those learned by sampling, and they can provide insights into the relations between PPIs and the various feature kernels, thereby improving PPI prediction.
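The two computational steps named in the abstract, fitting kernel weights by linear programming and converting the weighted kernel into an RL kernel, can be illustrated with a minimal sketch. It is not the authors' implementation. It rests on several assumptions: the element-wise difference is taken as a sum of absolute values, the weights are constrained to be non-negative, the RL kernel uses the standard form (I + alpha * L)^-1 with L the graph Laplacian, and the target matrix is a synthetic stand-in rather than a transition matrix built from real training data. The function names, `alpha`, and the toy data are all hypothetical.

```python
import numpy as np
from scipy.optimize import linprog


def learn_kernel_weights(kernels, target):
    """Find w >= 0 minimizing sum_ij |target_ij - sum_k w_k * K_k[i, j]|.

    The absolute values are linearized with one slack variable per matrix
    element, which turns the problem into a standard linear program.
    """
    m = len(kernels)
    A = np.column_stack([K.ravel() for K in kernels])  # shape (n*n, m)
    t = target.ravel()
    n_el = t.size
    # Decision variables: [w_1..w_m, e_1..e_{n_el}]; the objective is sum(e).
    c = np.concatenate([np.zeros(m), np.ones(n_el)])
    I = np.eye(n_el)
    # Encode |A w - t| <= e as two sets of inequalities.
    A_ub = np.vstack([np.hstack([A, -I]), np.hstack([-A, -I])])
    b_ub = np.concatenate([t, -t])
    bounds = [(0, None)] * (m + n_el)
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:m]


def regularized_laplacian_kernel(adjacency, alpha=0.1):
    """Return (I + alpha * L)^-1, where L = D - A for a symmetrized A."""
    A = (adjacency + adjacency.T) / 2.0
    L = np.diag(A.sum(axis=1)) - A
    return np.linalg.inv(np.eye(A.shape[0]) + alpha * L)


# Toy data (assumption): three random symmetric, non-negative feature kernels
# over six proteins, of which the first four form the training subnetwork.
rng = np.random.default_rng(0)
n, train = 6, np.arange(4)
kernels = []
for _ in range(3):
    M = rng.random((n, n))
    kernels.append((M + M.T) / 2.0)

# Stand-in target on the training block, built from known weights so that
# the recovered weights can be checked against them.
true_w = np.array([0.7, 0.0, 0.3])
sub = [K[np.ix_(train, train)] for K in kernels]
target = sum(w * K for w, K in zip(true_w, sub))

weights = learn_kernel_weights(sub, target)
combined = sum(w * K for w, K in zip(weights, kernels))  # whole network
rl = regularized_laplacian_kernel(combined, alpha=0.1)
print(np.round(weights, 4))
```

Fitting the weights only on the training block and then applying them to the full kernels mirrors the requirement that the weighted kernel agree with the transition matrix on the training subnetwork while still covering the whole network. Large entries of `rl` between unlinked protein pairs would then be read as candidate interactions.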
