Proceedings of the National Academy of Sciences of the United States of America

Tree-structured supervised learning and the genetics of hypertension.



Abstract

This paper is about an algorithm, FlexTree, for general supervised learning. It extends the binary tree-structured approach (Classification and Regression Trees, CART) although it differs greatly in its selection and combination of predictors. It is particularly applicable to assessing interactions: gene by gene and gene by environment as they bear on complex disease. One model for predisposition to complex disease involves many genes. Of them, most are pure noise; each of the values that is not the prevalent genotype for the minority of genes that contribute to the signal carries a "score." Scores add. Individuals with scores above an unknown threshold are predisposed to the disease. For the additive score problem and simulated data, FlexTree has cross-validated risk better than many cutting-edge technologies to which it was compared when small fractions of candidate genes carry the signal. For the model where only a precise list of aberrant genotypes is predisposing, there is not a systematic pattern of absolute superiority; however, overall, FlexTree seems better than the other technologies. We tried the algorithm on data from 563 Chinese women, 206 hypotensive, 357 hypertensive, with information on ethnicity, menopausal status, insulin-resistant status, and 21 loci. FlexTree and Logic Regression appear better than the others in terms of Bayes risk. However, the differences are not significant in the usual statistical sense.
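The additive score model described in the abstract can be illustrated with a small simulation. The sketch below is not FlexTree and is not taken from the paper; it only generates data of the kind the abstract describes. Everything specific in it is an assumption made for illustration: the genotype coding (0 for the prevalent genotype, 1 and 2 for the others), the genotype frequencies, the score values, the threshold, and the function names `simulate_additive_score_data`, `total_score`, and `is_predisposed`.

```python
import random


def total_score(genotype, scores):
    """Sum the scores carried by non-prevalent genotypes at signal genes."""
    return sum(by_value.get(genotype[g], 0.0) for g, by_value in scores.items())


def is_predisposed(genotype, scores, threshold):
    """An individual is predisposed when the additive score exceeds the threshold."""
    return total_score(genotype, scores) > threshold


def simulate_additive_score_data(n_people=200, n_genes=20, n_signal=3,
                                 threshold=1.5, seed=0):
    """Simulate genotypes in which only a few genes carry signal.

    All numeric choices here are illustrative assumptions.
    """
    rng = random.Random(seed)
    # A minority of genes contribute to the signal; the rest are pure noise.
    signal_genes = sorted(rng.sample(range(n_genes), n_signal))
    # Assumed scores: genotype 0 is prevalent and scores nothing.
    scores = {g: {1: 1.0, 2: 2.0} for g in signal_genes}
    rows, labels = [], []
    for _ in range(n_people):
        genotype = rng.choices([0, 1, 2], weights=[0.6, 0.3, 0.1], k=n_genes)
        rows.append(genotype)
        labels.append(1 if is_predisposed(genotype, scores, threshold) else 0)
    return rows, labels, signal_genes, scores


if __name__ == "__main__":
    rows, labels, signal_genes, scores = simulate_additive_score_data()
    print("signal genes:", signal_genes)
    print("fraction predisposed:", sum(labels) / len(labels))
```

In a simulation like this the learner sees only `rows` and `labels`; the signal genes, scores, and threshold stay hidden, which is what makes the problem hard when few candidate genes carry the signal.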
