Conference: International Conference on Bioinformatics and Computational Biology

Data Mining Approaches for Genome-Wide Association of Mood Disorders

Abstract

Mood disorders are highly heritable forms of major mental illness. A major breakthrough in elucidating the genetic architecture of mood disorders was anticipated with the advent of genome-wide association studies (GWAS). However, to date few susceptibility loci have been conclusively identified. The genetic etiology of mood disorders appears to be quite complex, and as a result, alternative approaches for analyzing GWAS data are needed. Recently, a polygenic scoring approach that captures the effects of alleles across multiple loci was successfully applied to the analysis of GWAS data in schizophrenia and bipolar disorder (BP). However, this method may be overly simplistic in its approach to the complexity of genetic effects. Data mining methods are available that may be applied to analyze the high dimensional data generated by GWAS of complex psychiatric disorders. We sought to compare the performance of three data mining methods, namely, Bayesian Networks (BN), Support Vector Machine (SVM), and Logistic Regression (LR), against the polygenic scoring approach in the analysis of GWAS data on BP. The different classification methods were trained on GWAS datasets from the Bipolar Genome Study (2,191 cases with BP and 1,434 controls) and their ability to accurately classify case/control status was tested on a GWAS dataset from the Wellcome Trust Case Control Consortium. The performance of the classifiers in the test dataset was evaluated by comparing area under the receiver operating characteristic curves (AUC). BN performed the best of all the data mining classifiers, but none of these did significantly better than the polygenic score approach. We further examined a subset of SNPs in genes that are expressed in the brain, under the hypothesis that these might be most relevant to BP susceptibility, but all the classifiers performed worse with this reduced set of SNPs. 
The discriminative accuracy of all of these methods is unlikely to be of diagnostic or clinical utility at the present time. Further research is needed to develop strategies for selecting sets of SNPs likely to be relevant to disease susceptibility and to determine if other data mining classifiers that utilize other algorithms for inferring relationships among the sets of SNPs may perform better.
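The evaluation described above (fit a score on one dataset, then measure case/control discrimination on a separate dataset by AUC) can be illustrated with a minimal sketch. This is not the authors' pipeline: the genotypes are simulated, the helper names (`allele_log_odds`, `polygenic_score`, `auc`, `simulate`) are hypothetical, and the scoring here simply weights each SNP's allele count by a training-set log odds ratio, without any SNP selection or quality-control steps a real GWAS analysis would need.

```python
import math
import random


def allele_log_odds(genotypes, labels):
    """Per-SNP log odds ratio of the counted allele (cases vs. controls)."""
    weights = []
    for j in range(len(genotypes[0])):
        # Allele counts per group; 0.5 is a continuity correction
        # so that empty cells do not cause division by zero.
        case_alt = 0.5 + sum(g[j] for g, y in zip(genotypes, labels) if y == 1)
        case_ref = 0.5 + sum(2 - g[j] for g, y in zip(genotypes, labels) if y == 1)
        ctrl_alt = 0.5 + sum(g[j] for g, y in zip(genotypes, labels) if y == 0)
        ctrl_ref = 0.5 + sum(2 - g[j] for g, y in zip(genotypes, labels) if y == 0)
        weights.append(math.log((case_alt * ctrl_ref) / (case_ref * ctrl_alt)))
    return weights


def polygenic_score(genotype, weights):
    """Weighted sum of allele counts (0, 1, or 2) across SNPs."""
    return sum(w * g for w, g in zip(weights, genotype))


def auc(scores, labels):
    """AUC as the probability that a random case outscores a random control.

    Ties count as one half (the Mann-Whitney formulation).
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / (len(cases) * len(controls))


def simulate(n, case_freqs, ctrl_freqs, rng):
    """Simulated individuals: random case/control label, binomial genotypes."""
    genotypes, labels = [], []
    for _ in range(n):
        y = rng.randint(0, 1)
        freqs = case_freqs if y == 1 else ctrl_freqs
        # Two independent allele draws per SNP give a count in {0, 1, 2}.
        genotypes.append([(rng.random() < f) + (rng.random() < f) for f in freqs])
        labels.append(y)
    return genotypes, labels


if __name__ == "__main__":
    rng = random.Random(0)
    # Assumed allele frequencies; cases differ only slightly from controls.
    ctrl_freqs = [rng.uniform(0.2, 0.8) for _ in range(50)]
    case_freqs = [min(0.95, f + rng.uniform(-0.03, 0.05)) for f in ctrl_freqs]

    train_x, train_y = simulate(500, case_freqs, ctrl_freqs, rng)
    test_x, test_y = simulate(500, case_freqs, ctrl_freqs, rng)

    weights = allele_log_odds(train_x, train_y)              # fit on training set
    scores = [polygenic_score(g, weights) for g in test_x]   # score the test set
    print(f"test AUC: {auc(scores, test_y):.3f}")
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of cases from controls. Any classifier that outputs a per-individual score could be substituted for `polygenic_score` and compared on the same test set with the same `auc` function.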
