Source journal: Molecular Genetics and Genomics (MGG)

Identification of compound-protein interactions through the analysis of gene ontology, KEGG enrichment for proteins and molecular fragments of compounds


Abstract

Compound-protein interactions play important roles in every cell via the recognition and regulation of specific functional proteins. The correct identification of compound-protein interactions can lead to a better comprehension of this complicated system and provide useful input for the investigation of various attributes of compounds and proteins. In this study, we attempted to understand this system by extracting properties from both proteins and compounds: proteins were represented by Gene Ontology (GO) and KEGG pathway enrichment scores, and compounds were represented by molecular fragments. Feature selection methods, namely minimum redundancy maximum relevance (mRMR) and incremental feature selection (IFS), together with the random forest machine learning algorithm, were used to analyze these properties and extract core factors for the determination of actual compound-protein interactions. Compound-protein interactions reported in The Binding Databases were used as positive samples. To improve the reliability of the results, the analytic procedure was executed five times using different negative samples. Five optimal prediction methods, based on a random forest and yielding maximum Matthews correlation coefficients (MCCs) of approximately 77.55%, were constructed and may be useful tools for the prediction of compound-protein interactions. This work provides new clues to understanding the system of compound-protein interactions by analyzing extracted core features. Our results indicate that compound-protein interactions are related to biological processes involving immune, developmental and hormone-associated pathways.
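The evaluation loop named in the abstract can be illustrated with a minimal sketch. It assumes the features have already been ranked (for example by mRMR) and that the caller supplies an `evaluate` callback returning the MCC of a classifier trained on a given feature subset. The function names `mcc` and `incremental_feature_selection`, and the callback itself, are hypothetical and are not taken from the paper. Incremental feature selection then scores each top-k prefix of the ranking and keeps the prefix with the highest MCC.

```python
import math
from typing import Callable, List, Sequence, Tuple


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Common convention: report 0.0 when the denominator is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def incremental_feature_selection(
    ranked: Sequence[str],
    evaluate: Callable[[Sequence[str]], float],
) -> Tuple[List[str], float]:
    """Score every top-k prefix of a ranked feature list; keep the best.

    `evaluate` is a hypothetical callback: in a real pipeline it would
    train a classifier (e.g. a random forest) on the given features and
    return its cross-validated MCC.
    """
    best_k, best_score = 0, float("-inf")
    for k in range(1, len(ranked) + 1):
        score = evaluate(ranked[:k])
        if score > best_score:  # ties keep the smaller feature set
            best_k, best_score = k, score
    return list(ranked[:best_k]), best_score


if __name__ == "__main__":
    # Toy stand-in for a trained model: the score depends only on subset size.
    toy_scores = {1: 0.2, 2: 0.7, 3: 0.5}
    features, score = incremental_feature_selection(
        ["f1", "f2", "f3"], lambda subset: toy_scores[len(subset)]
    )
    print(features, score)
```

Keeping the smaller subset on ties is a design choice of this sketch, not something the abstract specifies.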
