Mining Pure, Strict Epistatic Interactions from High-Dimensional Datasets: Ameliorating the Curse of Dimensionality

Abstract

Background: The interaction between loci to affect phenotype is called epistasis. It is strict epistasis if no proper subset of the interacting loci exhibits a marginal effect. For many diseases, it is likely that unknown epistatic interactions affect disease susceptibility. A difficulty when mining epistatic interactions from high-dimensional datasets concerns the curse of dimensionality. There are too many combinations of SNPs to perform an exhaustive search. A method that could locate strict epistasis without an exhaustive search can be considered the brass ring of methods for analyzing high-dimensional datasets.

Methodology/Findings: A SNP pattern is a Bayesian network representing SNP-disease relationships. The Bayesian score for a SNP pattern is the probability of the data given the pattern, and has been used to learn SNP patterns. We identified a bound for the score of a SNP pattern. The bound provides an upper limit on the Bayesian score of any pattern that could be obtained by expanding a given pattern. We felt that the bound might enable the data to say something about the promise of expanding a 1-SNP pattern even when there are no marginal effects. We tested the bound using simulated datasets and semi-synthetic high-dimensional datasets obtained from GWAS datasets. We found that the bound was able to dramatically reduce the search time for strict epistasis. Using an Alzheimer's dataset, we showed that it is possible to discover an interaction involving the APOE gene based on its score because of its large marginal effect, but that the bound is most effective at discovering interactions without marginal effects.

Conclusions/Significance: We conclude that the bound appears to ameliorate the curse of dimensionality in high-dimensional datasets. This is a very consequential result and could be pivotal in our efforts to reveal the dark matter of genetic disease risk from high-dimensional datasets. © 2012 Jiang, Neapolitan.
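The definition of strict epistasis and the idea of scoring a SNP pattern by the probability of the data given the pattern can be illustrated with a small sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy dataset (two binary SNPs with an XOR disease rule), the helper names `k2_log_score`, `tabulate`, and `case_rate`, and the choice of a K2-style score (uniform Dirichlet prior per parent configuration) as a stand-in for the Bayesian score. The bound described in the abstract is not implemented here, since the abstract does not give its form.

```python
from itertools import product
from math import lgamma

def k2_log_score(counts):
    """Log marginal likelihood of a binary disease node (assumed K2-style score).

    counts: list of (n_controls, n_cases), one pair per parent configuration.
    Assumes a uniform Dirichlet(1, 1) prior for each configuration.
    """
    total = 0.0
    for n0, n1 in counts:
        # log[ Gamma(2) / Gamma(2 + n) * Gamma(1 + n0) * Gamma(1 + n1) ]
        total += lgamma(2) - lgamma(2 + n0 + n1) + lgamma(1 + n0) + lgamma(1 + n1)
    return total

# Hypothetical toy data: rows are (snp_a, snp_b, disease), disease = a XOR b,
# and each of the four SNP combinations appears 25 times.
rows = [(a, b, a ^ b) for a, b in product((0, 1), repeat=2) for _ in range(25)]

def tabulate(rows, snp_idx):
    """Count controls and cases for each joint value of the selected SNPs."""
    table = {}
    for row in rows:
        key = tuple(row[i] for i in snp_idx)
        cell = table.setdefault(key, [0, 0])
        cell[row[2]] += 1
    return [tuple(v) for _, v in sorted(table.items())]

def case_rate(rows, snp, value):
    """Fraction of cases among rows where the given SNP takes the given value."""
    selected = [r[2] for r in rows if r[snp] == value]
    return sum(selected) / len(selected)

# No single SNP shifts the disease rate: there is no marginal effect.
marginal_rates = [case_rate(rows, s, v) for s in (0, 1) for v in (0, 1)]

score_none = k2_log_score(tabulate(rows, ()))      # pattern with no SNP parent
score_a = k2_log_score(tabulate(rows, (0,)))       # 1-SNP pattern
score_ab = k2_log_score(tabulate(rows, (0, 1)))    # 2-SNP pattern

print(marginal_rates)
print(score_none, score_a, score_ab)
```

In this toy case the 1-SNP pattern scores no better than the pattern with no SNPs, while the 2-SNP pattern scores far higher. That is the situation in which a search guided only by 1-SNP scores would have no reason to expand either SNP.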

Bibliographic record

  • Authors

    Jiang X; Neapolitan RE

  • Year 2012
  • Original format PDF
  • Language of text English
