Journal: Bioinformatics

Training set expansion: an approach to improving the reconstruction of biological networks from limited and uneven reliable interactions

Abstract

Motivation: An important problem in systems biology is reconstructing complete networks of interactions between biological objects by extrapolating from a few known interactions as examples. While there are many computational techniques proposed for this network reconstruction task, their accuracy is consistently limited by the small number of high-confidence examples, and the uneven distribution of these examples across the potential interaction space, with some objects having many known interactions and others few.

Results: To address this issue, we propose two computational methods based on the concept of training set expansion. They work particularly effectively in conjunction with kernel approaches, which are a popular class of approaches for fusing together many disparate types of features. Both our methods are based on semi-supervised learning and involve augmenting the limited number of gold-standard training instances with carefully chosen and highly confident auxiliary examples. The first method, prediction propagation, propagates highly confident predictions of one local model to another as the auxiliary examples, thus learning from information-rich regions of the training network to help predict the information-poor regions. The second method, kernel initialization, takes the most similar and most dissimilar objects of each object in a global kernel as the auxiliary examples. Using several sets of experimentally verified protein–protein interactions from yeast, we show that training set expansion gives a measurable performance gain over a number of representative, state-of-the-art network reconstruction methods, and it can correctly identify some interactions that are ranked low by other methods due to the lack of training examples of the involved proteins.

Contact:

Availability: The datasets and additional materials can be found at .
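The two expansion strategies named in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names, the toy kernel matrix `K`, the parameter `k`, the `threshold` value, and the simple mean-similarity scorer standing in for a trained local model are all assumptions made for illustration only.

```python
import numpy as np


def kernel_initialization(K, k=1):
    """For each object, take its k most similar objects in the kernel K as
    auxiliary positives and its k most dissimilar objects as auxiliary
    negatives. Assumes K is a symmetric similarity matrix with n - 1 >= 2k."""
    n = K.shape[0]
    aux_pos, aux_neg = {}, {}
    for i in range(n):
        others = np.array([j for j in range(n) if j != i])
        # Sort the other objects by increasing similarity to object i.
        order = others[np.argsort(K[i, others], kind="stable")]
        aux_neg[i] = order[:k].tolist()
        aux_pos[i] = order[-k:][::-1].tolist()
    return aux_pos, aux_neg


def local_scores(K, positives, negatives):
    """Hypothetical stand-in for a local model: score every object by its
    mean similarity to the positives minus its mean similarity to the negatives."""
    n = K.shape[0]
    pos = K[:, positives].mean(axis=1) if positives else np.zeros(n)
    neg = K[:, negatives].mean(axis=1) if negatives else np.zeros(n)
    return pos - neg


def prediction_propagation(K, pos, neg, threshold=0.5):
    """Visit objects from information-rich to information-poor. Each local
    model's confident predictions (score >= threshold) are added as auxiliary
    positives for both objects involved, so later local models can use them."""
    n = K.shape[0]
    expanded = {i: set(pos.get(i, ())) for i in range(n)}
    order = sorted(range(n), key=lambda i: -len(pos.get(i, ())))
    for i in order:
        if not expanded[i]:
            continue  # no examples yet, so no local model can be built
        known_neg = set(neg.get(i, ()))
        scores = local_scores(K, sorted(expanded[i]), sorted(known_neg))
        for j in range(n):
            if j == i or j in expanded[i] or j in known_neg:
                continue
            if scores[j] >= threshold:
                expanded[i].add(j)
                expanded[j].add(i)
    return expanded


# Toy kernel: objects 0 and 1 are similar to each other, as are 2 and 3.
K = np.array([
    [1.0, 0.9, 0.2, 0.1],
    [0.9, 1.0, 0.3, 0.4],
    [0.2, 0.3, 1.0, 0.8],
    [0.1, 0.4, 0.8, 1.0],
])

aux_pos, aux_neg = kernel_initialization(K, k=1)

# One known interaction (0, 2); objects 1 and 3 start with no examples.
expanded = prediction_propagation(K, pos={0: {2}, 2: {0}}, neg={})
```

In this toy setting, the single known interaction between objects 0 and 2 spreads to their close neighbours in the kernel, so objects 1 and 3 end up with auxiliary examples even though they started with none. A real application would replace `local_scores` with a properly trained classifier and choose `threshold` and `k` by validation.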