IEEE International Conference on Bioinformatics and Biomedicine

Protein-protein interaction network inference from multiple kernels with optimization based on random walk by linear programming

Abstract

Reconstruction of PPI networks is a central task in systems biology, and inference from multiple heterogeneous data sources offers a promising computational approach to de novo PPI prediction by leveraging complementary information and the partial network structure. However, how to quickly learn weights for heterogeneous data sources remains a challenge. In this work, we developed a method to infer de novo PPIs by combining multiple data sources represented in kernel format and obtaining optimal weights based on a random walk over the existing partial network. Our proposed method uses the Barker algorithm and the training data to construct a transition matrix that constrains how a random walk would traverse the partial network. Multiple heterogeneous features for the proteins in the network, including gene expression and Pfam domain profiles, are then combined into a weighted kernel, which provides a new “adjacency matrix” for the whole network but is required to comply with the transition matrix on the training subnetwork. This requirement is met by adjusting the weights to minimize the element-wise difference between the transition matrix and the weighted kernel. The minimization problem is solved by linear programming. The weighted kernel is then transformed into a regularized Laplacian (RL) kernel to infer missing or new edges in the PPI network. Results on synthetic data and real data from yeast show that the accuracy of PPI prediction, measured by AUC, increases by up to 19% compared with a control method that does not use optimal weights. Moreover, the weights learned by our method, Weight Optimization by Linear Programming (WOLP), are very consistent with those learned by sampling, and can provide insights into the relations between PPIs and various feature kernels, thereby improving PPI prediction.
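The weight-fitting and kernel-transformation steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names `learn_kernel_weights` and `regularized_laplacian_kernel`, the nonnegativity bound on the weights, the symmetrization step, and the parameter `alpha` are all assumptions made for illustration. The transition matrix is taken as a given input (its construction from the training data is not shown), and the element-wise difference is interpreted here as a sum of absolute differences, which is one standard way to obtain a linear program.

```python
import numpy as np
from scipy.optimize import linprog


def learn_kernel_weights(kernels, target):
    """Fit weights w >= 0 minimizing sum_ij |sum_k w_k K_k[i,j] - target[i,j]|.

    `kernels` are feature kernels restricted to the training subnetwork and
    `target` is the transition matrix on the same nodes (assumed given).
    """
    m = len(kernels)
    A = np.column_stack([K.ravel() for K in kernels])  # one column per kernel
    b = target.ravel()
    p = b.size
    # Variables: m weights followed by p slack variables t, with
    # t_i >= |(A w - b)_i|. The objective is the sum of the slacks.
    c = np.concatenate([np.zeros(m), np.ones(p)])
    A_ub = np.block([[A, -np.eye(p)], [-A, -np.eye(p)]])
    b_ub = np.concatenate([b, -b])
    bounds = [(0, None)] * (m + p)  # nonnegative weights: an assumption
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:m]


def regularized_laplacian_kernel(adjacency, alpha=0.1):
    """Return (I + alpha * L)^-1, where L = D - A is the graph Laplacian."""
    A = (adjacency + adjacency.T) / 2.0  # symmetrize (assumption)
    L = np.diag(A.sum(axis=1)) - A
    return np.linalg.inv(np.eye(A.shape[0]) + alpha * L)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Two toy nonnegative kernels built from random feature vectors.
    kernels = []
    for _ in range(2):
        X = rng.random((6, 3))
        kernels.append(X @ X.T)
    # Hypothetical target standing in for the transition matrix.
    target = 0.6 * kernels[0] + 0.4 * kernels[1]
    w = learn_kernel_weights(kernels, target)
    combined = sum(wk * K for wk, K in zip(w, kernels))
    rl = regularized_laplacian_kernel(combined, alpha=0.1)
    print("weights:", w)
    print("RL kernel shape:", rl.shape)
```

In this sketch, larger off-diagonal entries of the RL kernel would be read as stronger evidence for an edge between two proteins. How scores are thresholded or ranked for evaluation is not specified in the abstract.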
