Conference: IEEE/ACIS International Conference on Computer and Information Science

A computational method for identification of disease-associated non-coding SNPs in human genome



Abstract

Accurately identifying functionally relevant variants against the ubiquitous background of genetic variation is a significant challenge for bioinformatics researchers, and the challenge is more severe for non-coding variants. This study presents a novel computational method for identifying candidate disease-associated non-coding single nucleotide polymorphisms (SNPs) in the human genome. To characterize SNPs, an extensive range of features is extracted, including sequence context, DNA structure, evolutionary conservation, and histone modification signals. A random forest is then used to build the classifier, together with an ensemble method to handle unbalanced data. Ten-fold cross-validation shows that the proposed method achieves an area under the ROC curve (AUC) of 0.74. All original data and the MATLAB source code are available at https://sourceforge.net/projects/dissnp-predict/.
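
The abstract does not say which ensemble scheme is used for the unbalanced data, so the following is only a minimal sketch of one common option: under-sampling the larger class into chunks of roughly the smaller class's size, training one base learner per balanced subset, and averaging their scores. It is written in Python rather than the authors' MATLAB, and every name in it (`balanced_subsets`, `train_centroid_scorer`, `ensemble_score`, `auc`) is hypothetical. The base learner is a nearest-centroid stand-in, not a random forest, chosen only to keep the example self-contained. The `auc` function computes the area under the ROC curve as the fraction of positive/negative pairs that are ranked correctly.

```python
import random
from statistics import mean


def balanced_subsets(minority, majority, seed=0):
    """Pair all minority samples with disjoint chunks of the shuffled majority.

    Assumption: under-sampling is the ensemble scheme; the abstract
    does not specify which one the authors used.
    """
    rng = random.Random(seed)
    shuffled = majority[:]
    rng.shuffle(shuffled)
    k = max(1, len(shuffled) // len(minority))  # number of balanced subsets
    return [(minority, shuffled[i::k]) for i in range(k)]


def train_centroid_scorer(pos, neg):
    """Stand-in base learner (NOT a random forest): compares centroid distances."""
    dim = len(pos[0])
    pos_centroid = [mean(x[d] for x in pos) for d in range(dim)]
    neg_centroid = [mean(x[d] for x in neg) for d in range(dim)]

    def score(x):
        dist_pos = sum((a - b) ** 2 for a, b in zip(x, pos_centroid))
        dist_neg = sum((a - b) ** 2 for a, b in zip(x, neg_centroid))
        return dist_neg - dist_pos  # higher means closer to the positive class

    return score


def ensemble_score(models, x):
    """Average the scores of all base learners for one sample."""
    return mean(m(x) for m in models)


def auc(scores, labels):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

In a cross-validated setting, the subsets would be built from the training folds only and `auc` would be computed on the held-out fold, so that the reported figure reflects unseen SNPs.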
