Source journal: Psychiatric Genetics

Data mining approaches for genome-wide association of mood disorders

Abstract

BACKGROUND: Mood disorders are highly heritable forms of major mental illness. A major breakthrough in elucidating the genetic architecture of mood disorders was anticipated with the advent of genome-wide association studies (GWAS). However, to date, few susceptibility loci have been conclusively identified. The genetic etiology of mood disorders appears to be quite complex, and as a result, alternative approaches for analyzing GWAS data are needed. Recently, a polygenic scoring approach that captures the effects of alleles across multiple loci was successfully applied to the analysis of GWAS data in schizophrenia and bipolar disorder (BP). However, this method may be overly simplistic in its approach to the complexity of genetic effects. Data mining methods are available that may be applied to analyze the high-dimensional data generated by GWAS of complex psychiatric disorders.

METHODS: We sought to compare the performance of five data mining methods, namely Bayesian networks, support vector machine, random forest, radial basis function network, and logistic regression, against the polygenic scoring approach in the analysis of GWAS data on BP. The different classification methods were trained on GWAS datasets from the Bipolar Genome Study (2191 cases with BP and 1434 controls), and their ability to accurately classify case/control status was tested on a GWAS dataset from the Wellcome Trust Case Control Consortium.

RESULTS: The performance of the classifiers in the test dataset was evaluated by comparing the areas under the receiver operating characteristic curves. Bayesian networks performed best among the data mining classifiers, but none of the classifiers did significantly better than the polygenic score approach. We further examined a subset of single-nucleotide polymorphisms (SNPs) in genes that are expressed in the brain, under the hypothesis that these might be most relevant to BP susceptibility, but all the classifiers performed worse with this reduced set of SNPs.

CONCLUSION: The discriminative accuracy of all of these methods is unlikely to be of diagnostic or clinical utility at the present time. Further research is needed to develop strategies for selecting sets of SNPs likely to be relevant to disease susceptibility, and to determine whether data mining classifiers that use other algorithms for inferring relationships among the sets of SNPs may perform better.
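
For readers unfamiliar with the two quantities the abstract relies on, the following is a minimal illustrative sketch, not the authors' code. It assumes a polygenic score is a weighted sum of risk-allele counts (0, 1, or 2 per SNP) and computes the area under the ROC curve as the probability that a randomly chosen case scores higher than a randomly chosen control. The function names, weights, and genotypes are all hypothetical toy values; in practice the weights would typically be effect sizes estimated in the training GWAS.

```python
def polygenic_score(genotypes, weights):
    """Weighted sum of risk-allele counts (0, 1, or 2) across SNPs."""
    return sum(g * w for g, w in zip(genotypes, weights))


def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Equals the fraction of (case, control) pairs in which the case has
    the higher score; tied pairs count as one half.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


if __name__ == "__main__":
    # Hypothetical per-SNP weights, standing in for training-set effect sizes.
    weights = [0.10, 0.05, -0.08]
    # Hypothetical test-set genotypes: one row per individual, one column per SNP.
    test_genotypes = [[2, 1, 0], [1, 2, 0], [0, 0, 2], [0, 1, 1]]
    labels = [1, 1, 0, 0]  # 1 = case, 0 = control
    scores = [polygenic_score(g, weights) for g in test_genotypes]
    print("AUC on toy data:", auc(scores, labels))
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of cases from controls, which is the scale on which the classifiers in the abstract are compared.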
