
Identifying GPCR-drug interaction based on wordbook learning from sequences



Abstract

Background: G protein-coupled receptors (GPCRs) mediate a variety of important physiological functions, are closely related to many diseases, and constitute the most important target family of modern drugs. Research on GPCR analysis and GPCR ligand screening is therefore a hotspot in new drug development. Accurately identifying GPCR-drug interactions is one of the key steps in designing GPCR-targeted drugs. However, it is prohibitively expensive to ascertain the interactions of GPCR-drug pairs experimentally on a large scale, so predicting these interactions directly from the molecular sequences is of great significance. With the accumulation of known GPCR-drug interaction data, it has become feasible to develop sequence-based machine learning models that predict whether a query GPCR-drug pair interacts.

Results: In this paper, a new sequence-based method is proposed to identify GPCR-drug interactions. For GPCRs, we use a novel bag-of-words (BoW) model to extract sequence features; it captures more pattern information, from low order to high order, while limiting the dimension of the feature space. For drug molecules, we use the discrete Fourier transform (DFT) to extract higher-order pattern information from the original molecular fingerprints. The feature vectors of the two kinds of molecules are concatenated and fed into a simple prediction engine, the distance-weighted K-nearest-neighbor (DWKNN) classifier. This basic method is easily enhanced through ensemble learning. In tests on recently constructed GPCR-drug interaction datasets, the proposed methods generalize better than existing sequence-based machine learning methods, including an unconventional method whose prediction performance was further improved by a post-processing procedure (PPP).

Conclusions: The proposed methods are effective for GPCR-drug interaction prediction, and may also be applicable to other target-drug interaction prediction or to protein-protein interaction prediction. In addition, the newly proposed feature extraction method for GPCR sequences is a modified version of the traditional BoW model and may be useful for problems of protein classification or attribute prediction. The source code of the proposed methods is freely available for academic research at https://github.com/wp3751/GPCR-Drug-Interaction.
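
The pipeline outlined in the Results (sequence features for the GPCR, DFT of the drug fingerprint, concatenation, distance-weighted nearest-neighbor voting) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the modified BoW model, so plain normalized k-mer counts stand in for it, and inverse-distance voting is only one common choice of DWKNN weighting. All function names and parameters below are hypothetical.

```python
import cmath
import math
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def kmer_bow(sequence, k=2, alphabet=AMINO_ACIDS):
    """Normalized k-mer counts over a fixed vocabulary.

    Assumption: a plain k-mer bag-of-words, used here as a stand-in for
    the paper's modified BoW model, which the abstract does not detail.
    """
    vocab = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    total = max(sum(counts[w] for w in vocab), 1)  # avoid division by zero
    return [counts[w] / total for w in vocab]


def dft_magnitudes(bits):
    """Magnitudes of the discrete Fourier transform of a fingerprint."""
    n = len(bits)
    return [
        abs(sum(b * cmath.exp(-2j * cmath.pi * f * t / n)
                for t, b in enumerate(bits)))
        for f in range(n)
    ]


def pair_features(sequence, fingerprint, k=2):
    """Concatenate the GPCR and drug feature vectors for one pair."""
    return kmer_bow(sequence, k) + dft_magnitudes(fingerprint)


def dwknn_predict(train_x, train_y, query, k=3, eps=1e-9):
    """Distance-weighted k-NN: each of the k nearest neighbors votes
    for its label with weight 1 / (distance + eps).

    Assumption: inverse-distance weights; other weighting schemes exist.
    """
    nearest = sorted(
        (math.dist(x, query), y) for x, y in zip(train_x, train_y)
    )[:k]
    votes = {}
    for dist, label in nearest:
        votes[label] = votes.get(label, 0.0) + 1.0 / (dist + eps)
    return max(votes, key=votes.get)
```

In use, each known GPCR-drug pair would be passed through `pair_features` to build the training set, and a query pair would be classified as interacting or non-interacting with `dwknn_predict`. Restricting `k` in `kmer_bow` keeps the vocabulary at 20^k entries, which is one simple way to bound the feature-space dimension.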
