Journal of the Endocrine Society

SAT-LB111 Improving Classification of Diabetes Etiology in Electronic Resources Using Phenotype Algorithms and Polygenic Risk Scores


Abstract

Electronic health records (EHRs) contain rich data for identifying and studying diabetes. Many phenotype algorithms have been developed to identify research subjects with type 2 diabetes (T2D), but very few accurately identify type 1 diabetes (T1D) cases or rarer monogenic and atypical metabolic presentations. Polygenic risk scores (PRS) use common genomic variants to quantify disease risk and perform well for both T1D and T2D. In this study, we apply validated phenotyping algorithms to EHRs linked to a genomic biobank to understand the independent contribution of PRS to the classification of diabetes etiology and to generate additional novel markers that distinguish subtypes of diabetes in EHR data. Using a de-identified mirror of a medical center's electronic health record, we applied published algorithms for T1D and T2D to identify cases, and used natural language processing and chart review strategies to identify cases of maturity-onset diabetes of the young (MODY) and other rarer presentations. This novel approach included additional data types such as medication sequencing, the ratio and temporality of insulin and non-insulin agents, clinical genetic testing, and ratios of diagnostic codes. Chart review was performed to validate etiology. To calculate PRS, we used genome-wide genotyping from our BioBank, a de-identified biobank linking EHR to genomic data, and applied the coefficients of 65 published T1D SNPs and 76,996 T2D SNPs using PLINK in Caucasian subjects. In the dataset, we identified 82,238 cases of T2D but only 130 cases of T1D using the most cited published algorithms. Adding novel structured elements and natural language processing identified an additional 138 cases of T1D and distinguished 354 cases as MODY. Among more than 90,000 subjects with genotyping data available, we included 72,624 Caucasian subjects, since the PRS coefficients were generated in Caucasian cohorts. Among those subjects, 248, 6,488, and 21 were identified as T1D, T2D, and MODY cases, respectively, in our final PRS cohort. The T1D PRS discriminated significantly between cases and controls (Mann-Whitney p = 3.4e-17). The T2D PRS did not significantly discriminate between cases and controls identified using published algorithms. The atypical case count was too low to calculate PRS discrimination. Calculation of the PRS was limited by the quality and availability of the variants that could be included, and discrimination may improve in larger data sets. Additionally, blinded physician case review is ongoing to validate the novel classification scheme and to provide a gold standard for machine learning approaches that can be applied in validation sets.
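As a rough illustration of the two quantitative steps described in the abstract, the sketch below computes a PRS as a weighted sum of risk-allele dosages and then measures case/control discrimination with the Mann-Whitney U statistic. It is a minimal sketch under stated assumptions, not the authors' pipeline: the study used PLINK with published coefficients, whereas the SNP identifiers (`rsA`, `rsB`, `rsC`), the weights, and the genotypes here are invented toy values, and all function names are hypothetical.

```python
from typing import Dict, List


def polygenic_risk_score(dosages: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of risk-allele dosages (0, 1 or 2) over the weighted SNPs.

    SNPs that are missing from a subject's genotype data are skipped, which is
    one simple way to handle variants that are unavailable or fail quality filters.
    """
    return sum(beta * dosages[snp] for snp, beta in weights.items() if snp in dosages)


def mann_whitney_u(cases: List[float], controls: List[float]) -> float:
    """U statistic for the case group: count of (case, control) pairs in which
    the case score is higher, with ties counted as 0.5."""
    u = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                u += 1.0
            elif case_score == control_score:
                u += 0.5
    return u


def auc_from_u(u: float, n_cases: int, n_controls: int) -> float:
    """Rescale U to the area under the ROC curve (0.5 means no discrimination)."""
    return u / (n_cases * n_controls)


# Toy inputs (assumptions for illustration only, not study data).
weights = {"rsA": 0.5, "rsB": 0.25, "rsC": -0.125}
case_genotypes = [
    {"rsA": 2, "rsB": 1, "rsC": 0},
    {"rsA": 1, "rsB": 2, "rsC": 0},
]
control_genotypes = [
    {"rsA": 0, "rsB": 1, "rsC": 2},
    {"rsA": 1, "rsB": 0, "rsC": 1},
    {"rsA": 0, "rsB": 0},  # rsC not genotyped for this subject
]

case_scores = [polygenic_risk_score(g, weights) for g in case_genotypes]
control_scores = [polygenic_risk_score(g, weights) for g in control_genotypes]

u_stat = mann_whitney_u(case_scores, control_scores)
auc = auc_from_u(u_stat, len(case_scores), len(control_scores))
```

The pairwise loop is quadratic in cohort size and is only meant to make the definition visible. For cohorts of the size reported here, a rank-based implementation such as `scipy.stats.mannwhitneyu`, which also returns a p-value, would be the more practical choice.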
