Journal: Bioinformatics

Prediction of the phenotypic effects of non-synonymous single nucleotide polymorphisms using structural and evolutionary information


Abstract

Motivation: There has been great expectation that knowledge of an individual's genotype will provide a basis for assessing susceptibility to diseases and designing individualized therapy. Non-synonymous single nucleotide polymorphisms (nsSNPs), which lead to an amino acid change in the protein product, are of particular interest because they account for nearly half of the known genetic variations related to human inherited diseases. To facilitate the identification of disease-associated nsSNPs among a large number of neutral nsSNPs, it is important to develop computational tools that predict the phenotypic effects of nsSNPs.

Results: We prepared a training set based on the variant phenotypic annotation of the Swiss-Prot database and focused our analysis on nsSNPs having homologous 3D structures. Structural environment parameters derived from the homologous 3D structure, as well as evolutionary information derived from the multiple sequence alignment, were used as predictors. Two machine learning methods, support vector machine and random forest, were trained and evaluated. We compared the performance of our method with that of the SIFT algorithm, which is one of the best predictive methods to date. An unbiased evaluation study shows that for nsSNPs with sufficient evolutionary information (at least 10 homologous sequences), the performance of our method is comparable with that of the SIFT algorithm, while for nsSNPs with insufficient evolutionary information (fewer than 10 homologous sequences), our method significantly outperforms the SIFT algorithm. These findings indicate that incorporating structural information is critical to achieving good prediction accuracy when sufficient evolutionary information is not available.
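The abstract does not list the exact features or the evaluation code, so the following is only a minimal sketch under stated assumptions. It illustrates two ideas from the Results: a conservation score taken from one column of a multiple sequence alignment (Shannon entropy is used here as an assumed stand-in for "evolutionary information"), and an accuracy report split at the 10-homolog threshold. The function names `column_entropy`, `evidence_stratum`, and `stratified_accuracy`, and the toy records, are hypothetical and not taken from the paper.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (bits) of one alignment column, ignoring gap characters.

    Assumption: entropy is an illustrative conservation measure, not
    necessarily the one used by the authors.
    """
    residues = [c for c in column if c != "-"]
    if not residues:
        return 0.0
    counts = Counter(residues)
    total = len(residues)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def evidence_stratum(n_homologs, threshold=10):
    """Label a variant by how many homologous sequences support it.

    The threshold of 10 follows the abstract: at least 10 is 'sufficient'.
    """
    return "sufficient" if n_homologs >= threshold else "insufficient"


def stratified_accuracy(records, threshold=10):
    """Accuracy per stratum for (n_homologs, true_label, predicted_label) records."""
    hits = {"sufficient": 0, "insufficient": 0}
    totals = {"sufficient": 0, "insufficient": 0}
    for n_homologs, truth, predicted in records:
        stratum = evidence_stratum(n_homologs, threshold)
        totals[stratum] += 1
        hits[stratum] += int(truth == predicted)
    # None marks a stratum with no records, to avoid dividing by zero.
    return {s: (hits[s] / totals[s] if totals[s] else None) for s in totals}


# Toy, made-up records purely to show the call shape.
toy_records = [
    (12, "disease", "disease"),
    (15, "neutral", "disease"),
    (3, "disease", "disease"),
]
report = stratified_accuracy(toy_records)
```

Reporting the two strata separately mirrors the comparison described in the abstract, where the method and SIFT are contrasted for variants with and without enough homologous sequences.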
