Conference: International Conference on Data Management, Analytics and Innovation

Comparative Study of Machine Learning Models to Classify Gene Variants of ClinVar

Abstract

Single-nucleotide polymorphisms (SNPs) are variants that occur in every person's genome. An SNP is a difference at a single position, where one nucleotide is replaced by another, and SNPs act as biomarkers in disease prediction by helping to locate associated genes. ClinVar is a public archive of reports of human variants, which are SNPs with supporting evidence. When laboratories report different classifications for the same variant in a gene, such as pathogenic, benign, or uncertain significance, this may cause confusion when clinicians or researchers try to interpret the variant's impact on a given patient's disease. It is therefore important to identify which reported SNPs have conflicting classifications and which do not. This paper presents a comparative performance analysis of machine learning algorithms applied to classify gene variants on a dataset of 65,188 records taken from Kaggle. We found that the neural network model shows some bias in classifying the gene variants. In this scenario, the random forest classifier performs well in classifying gene variants, satisfying all the accuracy parameters.
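The abstract does not say which accuracy parameters were used or how the bias of the neural network was measured. As a minimal sketch, assuming a binary task (1 = conflicting classifications, 0 = consistent) and the common metrics accuracy, precision, recall, and F1, the following shows how two classifiers can share the same accuracy while one of them never detects the minority class. The function name `binary_metrics` and the toy labels are hypothetical and are not taken from the paper or its dataset.

```python
from collections import Counter


def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for one positive label."""
    pairs = Counter(zip(y_true, y_pred))
    tp = sum(n for (t, p), n in pairs.items() if t == positive and p == positive)
    fp = sum(n for (t, p), n in pairs.items() if t != positive and p == positive)
    fn = sum(n for (t, p), n in pairs.items() if t == positive and p != positive)
    tn = sum(n for (t, p), n in pairs.items() if t != positive and p != positive)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels, made up for illustration only:
# 1 = conflicting classifications, 0 = consistent classifications.
y_true = [0] * 8 + [1] * 2

# A biased model that always predicts the majority class.
always_majority = [0] * 10

# A hypothetical model with 7 true negatives, 1 false positive,
# 1 true positive and 1 false negative.
balanced = [0] * 7 + [1] + [1, 0]

biased_scores = binary_metrics(y_true, always_majority)
balanced_scores = binary_metrics(y_true, balanced)

for name, scores in [("always_majority", biased_scores),
                     ("balanced", balanced_scores)]:
    print(name, scores)
```

Both toy models reach the same accuracy, but only the second one recovers any of the minority-class records, which is why a comparison of classifiers usually reports more than one metric.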
