Conference: Canadian Conference on Artificial Intelligence

Learning Disease Patterns from High-Throughput Genomic Profiles: Why Is It So Challenging?

Abstract

In the 20th century, geneticists anticipated that the patterns of complex diseases would be decoded easily soon after whole-genome profiling technologies became available. However, we recently found it extremely difficult to predict women's susceptibility to breast cancer from their germline genomic profiles: after applying a wide variety of biologically naive and biologically informed feature selection and supervised learning methods, we achieved an accuracy of only 59.55%, against a baseline of 51.52%. By contrast, in a separate study, we showed that these genomic profiles can be used to accurately predict the ancestral origins of individuals. While there are biomedical explanations for why an individual's ancestral roots are accurately predictable and her susceptibility to breast cancer is poorly predictable, my research attempts to use the computational learning theory framework to explain which concepts are learnable, based on three common characteristics of biomedical datasets: high dimensionality, label heterogeneity, and noise.