Journal: Bioinformatics

An efficient strategy for extensive integration of diverse biological data for protein function prediction



Abstract

Motivation: With the increasing availability of diverse biological information, protein function prediction approaches have converged towards the integration of heterogeneous data. Many of these approaches adapt existing techniques, such as machine-learning and probabilistic methods, which have proven successful on specific data types. However, their impact is hindered by two factors. First, there is little comparison between existing approaches. This is in part due to a divergence in the focus adopted by different works, which makes comparison difficult or even fuzzy. Second, there seems to be an over-emphasis on computationally demanding machine-learning methods, which runs counter to the surge in biological data. Analogous to the success of BLAST for sequence homology search, we believe that the ability to tap the escalating quantity, quality and diversity of biological data is crucial to the success of automated function prediction as a useful instrument for the advancement of proteomic research. We address these problems by: (1) providing a useful comparison between some prominent methods; (2) proposing Integrated Weighted Averaging (IWA), a scalable, efficient and flexible function prediction framework that integrates diverse information using simple weighting strategies and a local prediction method. The simplicity of the approach makes it possible to make predictions based on on-the-fly information fusion.

Results: In addition to its greater efficiency, IWA performs exceptionally well against existing approaches. In the presence of cross-genome information, which is overwhelming for existing approaches, IWA makes even better predictions. We also demonstrate the significance of appropriate weighting strategies in data integration.
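The abstract does not give IWA's formulas, so the following is only a generic illustration of the idea it names: combining evidence from several data sources with per-source weights and scoring functions from a protein's direct neighbours. It should not be read as the authors' method. The function name `weighted_average_scores`, the source names, the weights and the toy annotations are all hypothetical.

```python
from collections import defaultdict


def weighted_average_scores(neighbors, source_weights, annotations):
    """Score candidate functions for one query protein (illustrative only).

    neighbors:      dict, source name -> list of (neighbor_id, edge_confidence)
    source_weights: dict, source name -> weight (assumed reliability of the source)
    annotations:    dict, neighbor_id -> set of known function labels
    Returns a dict mapping each function label to a score in [0, 1].
    """
    totals = defaultdict(float)
    norm = 0.0
    for source, edges in neighbors.items():
        weight = source_weights.get(source, 0.0)
        for neighbor_id, confidence in edges:
            vote = weight * confidence  # contribution of this single link
            norm += vote
            for label in annotations.get(neighbor_id, ()):
                totals[label] += vote
    if norm == 0.0:
        return {}  # no usable evidence for this protein
    # Weighted average: share of the total evidence that supports each label.
    return {label: total / norm for label, total in totals.items()}


# Hypothetical toy data: two sources linking the query protein to P2 and P3.
neighbors = {
    "ppi": [("P2", 1.0), ("P3", 0.5)],
    "sequence": [("P3", 1.0)],
}
source_weights = {"ppi": 0.5, "sequence": 1.0}
annotations = {"P2": {"kinase"}, "P3": {"transport"}}

scores = weighted_average_scores(neighbors, source_weights, annotations)
for label, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{label}: {score:.3f}")
```

Because each prediction only touches the query protein's immediate neighbours, a scheme of this kind can be evaluated at query time as new sources are added, which is the sort of on-the-fly information fusion the abstract describes.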
