American Journal of Human Genetics

Mantis-ml: Disease-Agnostic Gene Prioritization from High-Throughput Genomic Screens by Stochastic Semi-supervised Learning

Abstract

Access to large-scale genomics datasets has increased the utility of hypothesis-free genome-wide analyses. However, gene signals are often insufficiently powered to reach experiment-wide significance, triggering a process of laborious triaging of genomic-association-study results. We introduce mantis-ml, a multi-dimensional, multi-step machine-learning framework that allows objective assessment of the biological relevance of genes to disease studies. Mantis-ml is an automated machine-learning framework that follows a multi-model approach of stochastic semi-supervised learning to rank disease-associated genes through iterative learning sessions on random balanced datasets across the protein-coding exome. When applied to a range of human diseases, including chronic kidney disease (CKD), epilepsy, and amyotrophic lateral sclerosis (ALS), mantis-ml achieved an average area under curve (AUC) prediction performance of 0.81–0.89. Critically, to prove its value as a tool that can be used to interpret exome-wide association studies, we overlapped mantis-ml predictions with data from published cohort-level association studies. We found a statistically significant enrichment of high mantis-ml predictions among the highest-ranked genes from hypothesis-free cohort-level statistics, indicating a substantial improvement over the performance of current state-of-the-art methods and pointing to the capture of true prioritization signals for disease-associated genes. Finally, we introduce a generic mantis-ml score (GMS) trained with over 1,200 features as a generic-disease-likelihood estimator, outperforming published gene-level scores. In addition to our tool, we provide a gene prioritization atlas that includes mantis-ml’s predictions across ten disease areas and empowers researchers to interactively navigate through the gene-triaging framework. 
Mantis-ml is an intuitive tool that supports the objective triaging of large-scale genomic discovery studies and enhances our understanding of complex genotype-phenotype associations.
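
The idea of "iterative learning sessions on random balanced datasets" can be illustrated with a small positive-unlabeled ranking loop. This is a minimal sketch under stated assumptions, not mantis-ml's actual implementation. Every function name, parameter, and the toy data are hypothetical, and a trivial nearest-centroid scorer stands in for the multiple classifiers the abstract mentions. In each iteration the known disease genes are paired with an equally sized random draw of unlabeled genes treated as provisional negatives, out-of-fold predictions are collected, and each gene's final score is the mean over all iterations in which it was held out.

```python
import math
import random
from collections import defaultdict


def centroid(rows):
    """Column-wise mean of a non-empty list of feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fit_predict(train, test, features):
    """Toy stand-in classifier (assumption): score each held-out gene by how
    much closer it lies to the positive centroid than to the negative one."""
    pos_c = centroid([features[g] for g, y in train if y == 1])
    neg_c = centroid([features[g] for g, y in train if y == 0])
    scores = {}
    for g in test:
        margin = sq_dist(features[g], neg_c) - sq_dist(features[g], pos_c)
        margin = max(-50.0, min(50.0, margin))  # keep exp() in a safe range
        scores[g] = 1.0 / (1.0 + math.exp(-margin))
    return scores


def rank_genes(features, known_positives, n_iterations=50, n_folds=5, seed=0):
    """Average out-of-fold scores over many random balanced datasets."""
    rng = random.Random(seed)
    positives = sorted(known_positives)
    unlabeled = sorted(g for g in features if g not in known_positives)
    if len(positives) < 2 or len(unlabeled) < len(positives):
        raise ValueError("need >= 2 positives and at least as many unlabeled genes")
    sums, counts = defaultdict(float), defaultdict(int)
    for _ in range(n_iterations):
        # Balanced dataset: all positives plus an equal-sized random draw
        # of unlabeled genes acting as provisional negatives.
        pos = positives[:]
        rng.shuffle(pos)
        neg = rng.sample(unlabeled, len(pos))
        # Stratified fold assignment: each class is spread across the folds.
        labeled = [(g, 1, i % n_folds) for i, g in enumerate(pos)]
        labeled += [(g, 0, i % n_folds) for i, g in enumerate(neg)]
        for k in range(n_folds):
            train = [(g, y) for g, y, fold in labeled if fold != k]
            test = [g for g, _, fold in labeled if fold == k]
            for g, s in fit_predict(train, test, features).items():
                sums[g] += s
                counts[g] += 1
    return {g: sums[g] / counts[g] for g in sums}


# Toy data (entirely made up): 10 known disease genes and 40 unlabeled genes,
# of which U0..U2 are hidden positives with disease-like feature values.
features = {}
for i in range(10):
    features[f"P{i}"] = [2.0 + 0.1 * (i % 4), 2.0 - 0.1 * (i % 3)]
for i in range(40):
    base = 2.0 if i < 3 else 0.0
    features[f"U{i}"] = [base + 0.1 * (i % 5), base + 0.1 * (i % 3)]
known = {f"P{i}" for i in range(10)}

scores = rank_genes(features, known)
print(sorted(scores, key=scores.get, reverse=True)[:5])
```

Because the negatives are redrawn every iteration, an unlabeled gene that happens to be a true disease gene contaminates only a fraction of the training sets, and it is still scored on its features whenever it lands in a held-out fold. A gene that is never drawn receives no score in this sketch, so the number of iterations should be large relative to the ratio of unlabeled to known genes.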
