Journal: Scientific Reports

Prioritization of Nonsynonymous Single Nucleotide Variants for Exome Sequencing Studies via Integrative Learning on Multiple Genomic Data

Abstract

The rapid advancement of next-generation sequencing technology has greatly accelerated progress in understanding human inherited diseases through innovations such as exome sequencing. Nevertheless, identifying causative variants from sequencing data remains a great challenge. Traditional statistical genetics approaches such as linkage analysis and association studies have limited power when applied to exome sequencing data, while relying solely on filtering strategies and on the predicted functional implications of mutations to pinpoint pathogenic variants is prone to producing false positives. To overcome these limitations, we propose a supervised learning approach, termed snvForest, that prioritizes candidate nonsynonymous single nucleotide variants for a specific type of disease by integrating 11 functional scores at the variant level and 8 association scores at the gene level. We conduct a series of large-scale in silico validation experiments, demonstrating the effectiveness of snvForest across 2,511 diseases with different modes of inheritance and the superiority of our approach over two state-of-the-art methods. We further apply snvForest to three real exome sequencing data sets of epileptic encephalopathies and intellectual disability to show that our approach can identify causative de novo mutations for these complex diseases. The online service and standalone software of snvForest are available at http://bioinfo.au.tsinghua.edu.cn/jianglab/snvforest .
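
The integration step described in the abstract can be illustrated with a small sketch. The abstract gives only the feature counts (11 variant-level scores, 8 gene-level scores) and says the method is supervised; it does not name the individual scores or spell out the learner. Everything else here is an assumption made for illustration: the function names, the synthetic training data, and the use of a bagged ensemble of one-feature threshold rules as a simple stand-in for a forest-style classifier. It is not the snvForest implementation.

```python
import random

N_VARIANT_SCORES = 11  # variant-level functional scores (count taken from the abstract)
N_GENE_SCORES = 8      # gene-level association scores (count taken from the abstract)


def make_features(variant_scores, gene_scores):
    """Concatenate both score groups into one 19-dimensional feature vector."""
    assert len(variant_scores) == N_VARIANT_SCORES
    assert len(gene_scores) == N_GENE_SCORES
    return list(variant_scores) + list(gene_scores)


def make_synthetic(n_per_class=50, seed=0):
    """Hypothetical training data: causal variants tend to have higher scores."""
    rng = random.Random(seed)
    n_feat = N_VARIANT_SCORES + N_GENE_SCORES
    X, y = [], []
    for _ in range(n_per_class):
        X.append([rng.uniform(0.3, 1.0) for _ in range(n_feat)])  # causal
        y.append(1)
        X.append([rng.uniform(0.0, 0.7) for _ in range(n_feat)])  # neutral
        y.append(0)
    return X, y


def fit_stump(X, y, rng):
    """Fit one threshold rule on a bootstrap sample and a random feature."""
    n = len(X)
    idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample (with replacement)
    feat = rng.randrange(len(X[0]))             # random feature, as in bagged trees
    best_correct, best_thr = -1, None
    for i in idx:
        thr = X[i][feat]
        # Count how often the rule "causal if feature >= thr" is right.
        correct = sum((X[j][feat] >= thr) == bool(y[j]) for j in idx)
        if correct > best_correct:
            best_correct, best_thr = correct, thr
    return feat, best_thr


def fit_ensemble(X, y, n_stumps=100, seed=0):
    """Train an ensemble of threshold rules (a stand-in for a forest)."""
    rng = random.Random(seed)
    return [fit_stump(X, y, rng) for _ in range(n_stumps)]


def score(ensemble, x):
    """Fraction of rules voting 'causal'; higher means higher priority."""
    return sum(x[f] >= t for f, t in ensemble) / len(ensemble)


def prioritize(ensemble, candidates):
    """Return candidate names sorted from most to least likely causal."""
    return sorted(candidates, key=lambda name: score(ensemble, candidates[name]),
                  reverse=True)


if __name__ == "__main__":
    X, y = make_synthetic()
    ensemble = fit_ensemble(X, y)
    # Hypothetical candidates from one exome, each with 11 + 8 scores.
    candidates = {
        "variant_A": make_features([0.9] * 11, [0.8] * 8),
        "variant_B": make_features([0.2] * 11, [0.1] * 8),
        "variant_C": make_features([0.9] * 11, [0.1] * 8),
    }
    for name in prioritize(ensemble, candidates):
        print(name, round(score(ensemble, candidates[name]), 3))
```

The point of the sketch is the shape of the pipeline: each candidate variant gets a single feature vector combining variant-level and gene-level evidence, a classifier trained on known causal and neutral variants scores it, and candidates are ranked by that score. In `variant_C`, a variant that looks damaging but lies in a gene with weak disease association ranks below one supported by both kinds of evidence.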