Canadian Conference on Artificial Intelligence

Learning Disease Patterns from High-Throughput Genomic Profiles: Why Is It So Challenging?

Abstract

In the 20th century, geneticists anticipated that the patterns of complex diseases would be decoded easily soon after whole-genome profiling technologies became available. However, we recently found it extremely difficult to predict women's susceptibility to breast cancer from their germline genomic profiles: after applying a wide variety of biologically naive and biologically informed feature selection and supervised learning methods, we achieved an accuracy of 59.55% against a baseline of 51.52%. By contrast, in a separate study, we showed that these genomic profiles can be used to accurately predict the ancestral origins of individuals. While there are biomedical explanations for why an individual's ancestral roots are accurately predictable and her susceptibility to breast cancer is not, my research attempts to use the computational learning theory framework to explain which concepts are learnable, based on three common characteristics of biomedical datasets: high dimensionality, label heterogeneity, and noise.
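
To put the reported accuracy in context, the following is a minimal sketch of how a gain over a baseline can be quantified. It assumes the 51.52% baseline is a majority-class baseline (always predicting the most frequent label), which the abstract does not state explicitly. The function names and the toy labels are hypothetical and are not taken from the study.

```python
from collections import Counter


def majority_baseline_accuracy(labels):
    """Accuracy obtained by always predicting the most frequent label."""
    counts = Counter(labels)
    return max(counts.values()) / len(labels)


def gain_over_baseline(accuracy, baseline):
    """Return the absolute gain (percentage points) and the fraction of
    the baseline's errors that the classifier removes."""
    absolute = accuracy - baseline
    error_reduction = absolute / (100.0 - baseline)
    return absolute, error_reduction


# Hypothetical toy labels (NOT the study's data), chosen only so that the
# majority-class share rounds to the same 51.52% as the reported baseline.
toy_labels = ["case"] * 17 + ["control"] * 16
toy_baseline = 100.0 * majority_baseline_accuracy(toy_labels)

# Figures reported in the abstract, in percent.
absolute, error_reduction = gain_over_baseline(59.55, 51.52)

print(f"toy baseline: {toy_baseline:.2f}%")
print(f"absolute gain: {absolute:.2f} percentage points")
print(f"share of baseline errors removed: {error_reduction:.1%}")
```

Under this assumption, the reported result corresponds to a gain of about 8 percentage points, which leaves most of the baseline's errors uncorrected.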
