Journal: BMC Medical Genomics

Bi-stream CNN Down Syndrome screening model based on genotyping array

Abstract

Human Down syndrome (DS) is usually caused by genomic micro-duplications and dosage imbalances of human chromosome 21, and it is associated with many genomic and phenotypic abnormalities. DS occurs in about 1 per 1,000 births worldwide, a very high rate, yet no effective cure has been found. Currently, the most efficient means of preventing DS are screening and early detection.

In this study, we applied deep learning to a set of Illumina genotyping array data and built a bi-stream convolutional neural network (CNN) model to screen for and predict the occurrence of DS. First, we built image input data by converting the intensities at each SNP site into chromosome SNP maps. Next, we proposed a bi-stream CNN architecture with nine layers and two branch models. The two branches are merged into one model at the fourth convolutional layer, and the prediction is output at the last layer.

Our bi-stream CNN model achieved an average accuracy of 99.3% with very low false-positive and false-negative rates, which is necessary for further applications in disease prediction and medical practice. We visualized the feature maps and learned filters from the intermediate convolutional layers, which showed genomic patterns and correlated SNP variations in human DS genomes. We also compared our method with other CNN and traditional machine learning models, and analyzed and discussed the characteristics and strengths of the bi-stream CNN model.

The bi-stream model uses two branch CNN models to learn local genome features and regional patterns among adjacent genes and SNP sites from two chromosomes simultaneously. It achieved the best performance on all evaluation metrics when compared with two single-stream CNN models and three traditional machine-learning algorithms. The visualized feature maps also provide opportunities to study the genomic markers and pathway components associated with human DS, offering insights for the development of gene therapy and genomic medicine.
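The abstract does not say how the SNP maps are laid out or how the error rates are computed, so the following is only a minimal sketch under stated assumptions. It arranges per-SNP intensities (assumed to be in chromosomal order) into a fixed-width 2D map, one map per chromosome as input for each branch, and computes accuracy, false-positive rate, and false-negative rate from predicted labels. The function names, the min-max scaling, the row width, and the zero padding are all hypothetical choices, not the authors' method.

```python
from typing import Dict, List, Sequence


def intensities_to_snp_map(intensities: Sequence[float], width: int,
                           pad_value: float = 0.0) -> List[List[float]]:
    """Arrange per-SNP intensities into a 2D map, row by row.

    Assumptions (not from the paper): values are min-max scaled to [0, 1],
    and the final row is padded with `pad_value` to reach `width`.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    if not intensities:
        return []
    lo, hi = min(intensities), max(intensities)
    span = hi - lo
    # A constant signal has no range to scale by; map it to all zeros.
    scaled = [(v - lo) / span if span else 0.0 for v in intensities]
    # Pad so that the total length is a multiple of the row width.
    remainder = len(scaled) % width
    if remainder:
        scaled.extend([pad_value] * (width - remainder))
    return [scaled[i:i + width] for i in range(0, len(scaled), width)]


def screening_metrics(y_true: Sequence[int],
                      y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, false-positive rate, and false-negative rate (label 1 = DS)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "false_positive_rate": fp / (fp + tn) if (fp + tn) else 0.0,
        "false_negative_rate": fn / (fn + tp) if (fn + tp) else 0.0,
    }
```

In a two-branch setup like the one described, `intensities_to_snp_map` would be called once per chromosome so that each branch receives its own map; the CNN itself is left out here because the abstract gives no layer-level details.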
