Journal: BMC Genomics

Integrating experimental and literature protein-protein interaction data for protein complex prediction



Abstract

Background: Accurate determination of protein complexes is crucial for understanding cellular organization and function. High-throughput experimental techniques have generated a large amount of protein-protein interaction (PPI) data, allowing prediction of protein complexes from PPI networks. However, high-throughput data often include false positives and false negatives, which makes accurate prediction of protein complexes difficult.

Method: The biomedical literature contains large quantities of PPI data that, along with high-throughput experimental PPI data, are valuable for protein complex prediction. In this study, we employ a natural language processing technique to extract PPI data from the biomedical literature. These data are then integrated with high-throughput PPI data and gene ontology data by constructing attributed PPI networks, and a novel method for predicting protein complexes from the attributed PPI networks is proposed. The method allows the relative contributions of high-throughput and biomedical literature PPI data to be calculated.

Results: Many well-characterized protein complexes are accurately predicted by this method when it is applied to two different yeast PPI datasets. The results show that (i) biomedical literature PPI data can effectively improve the performance of protein complex prediction; and (ii) our method makes good use of high-throughput and biomedical literature PPI data, along with gene ontology data, to achieve state-of-the-art protein complex prediction performance.
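The abstract does not specify how the attributed PPI network is built or scored, so the following is only a minimal illustrative sketch of the general idea, not the authors' method. It merges a high-throughput edge list with a literature-derived edge list, attaches gene ontology terms to proteins as node attributes, and computes a hypothetical edge score. The function names, the `alpha` weighting parameter, the Jaccard similarity term, and the protein and term labels are all assumptions introduced for illustration.

```python
def build_attributed_network(ht_edges, lit_edges, go_terms):
    """Merge two PPI edge lists; each edge records which sources support it.

    Hypothetical helper: nodes carry GO terms as attributes, edges carry
    the set of evidence sources ("high_throughput", "literature").
    """
    edges = {}
    for source, edge_list in (("high_throughput", ht_edges),
                              ("literature", lit_edges)):
        for a, b in edge_list:
            if a == b:
                continue  # ignore self-interactions in this sketch
            # frozenset makes the edge undirected: (A, B) == (B, A)
            edges.setdefault(frozenset((a, b)), set()).add(source)
    # every protein that appears in an edge gets a (possibly empty) GO term set
    nodes = {p: set(go_terms.get(p, ())) for key in edges for p in key}
    return nodes, edges


def edge_score(key, nodes, edges, alpha=0.5):
    """Assumed scoring rule: weighted source evidence plus GO Jaccard similarity.

    alpha weights high-throughput evidence; (1 - alpha) weights literature
    evidence. This is an illustrative choice, not taken from the paper.
    """
    a, b = tuple(key)
    sources = edges[key]
    evidence = (alpha * ("high_throughput" in sources)
                + (1 - alpha) * ("literature" in sources))
    union = nodes[a] | nodes[b]
    jaccard = len(nodes[a] & nodes[b]) / len(union) if union else 0.0
    return evidence + jaccard


# Toy input with made-up protein and term labels
ht = [("A", "B"), ("B", "C")]
lit = [("B", "A"), ("C", "D")]
go = {"A": {"term1", "term2"}, "B": {"term1"}}

nodes, edges = build_attributed_network(ht, lit, go)
for key in edges:
    print(sorted(key), sorted(edges[key]), edge_score(key, nodes, edges))
```

In a sketch like this, varying `alpha` is one simple way to explore how much each data source contributes to the scores. The paper's actual procedure for calculating relative contributions may differ.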
