Source journal: Journal of Translational Medicine

A machine learning-based clinical tool for diagnosing myopathy using multi-cohort microarray expression profiles



Abstract

Myopathies are a heterogeneous collection of disorders characterized by dysfunction of skeletal muscle. In practice, myopathies are frequently encountered by physicians, and precise diagnosis remains a challenge in primary care. Molecular expression profiles show promise for disease diagnosis in various pathologies. We propose a novel machine learning-based clinical tool for predicting muscle disease subtypes using multi-cohort microarray expression data. Muscle tissue samples originated from 1260 patients with muscle weakness. Data were curated from 42 independent cohorts with expression profiles in public microarray gene expression repositories, representing a broad range of patient ages and peripheral muscles. Cohorts were categorized into five muscle disease subtypes: immobility, inflammatory myopathies, intensive care unit acquired weakness (ICUAW), congenital, and chronic systemic disease. The dataset contains expression data on 34,099 genes. Data augmentation techniques were used to address class imbalances in the muscle disease subtypes. Support vector machine (SVM) models were trained on two-thirds of the 1260 samples using a gene signature of the top genes selected by analysis of variance (ANOVA). The models were validated on the remaining samples using the area under the receiver operating characteristic curve (AUC). Gene enrichment analysis was used to identify enriched biological functions in the gene signature. The AUC ranged from 0.611 to 0.649 in the observed imbalanced data. Overall, using the augmented data, chronic systemic disease was the best predicted class with AUC 0.872 (95% confidence interval (CI): 0.824–0.920). The least discriminated classes were ICUAW with AUC 0.777 (95% CI: 0.668–0.887) and immobility with AUC 0.789 (95% CI: 0.716–0.861). Disease-specific gene set enrichment results showed that the gene signature was enriched in biological processes including neural precursor cell proliferation for ICUAW and aerobic respiration for the congenital subtype (false discovery rate q-value < 0.001). Our results present a well-performing molecular classification tool with the selected gene markers for muscle disease classification. In practice, this tool addresses an important gap in the literature on myopathies and presents a potentially useful clinical tool for muscle disease subtype diagnosis.
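The gene selection and per-class evaluation steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. It only shows how a one-way ANOVA F statistic can rank genes across disease subtypes and how a one-vs-rest AUC can be computed from classifier scores. SVM training and data augmentation are not shown. The function names (`anova_f`, `top_genes`, `one_vs_rest_auc`), the gene names, and all numeric values are hypothetical and invented for illustration.

```python
from statistics import mean


def anova_f(values, labels):
    """One-way ANOVA F statistic for one gene across the class labels."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    n, k = len(values), len(groups)
    grand = mean(values)
    # Between-class and within-class sums of squares.
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups.values())
    ss_within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)
    if ss_within == 0:
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def top_genes(expression, labels, n_top):
    """Rank genes (dict: gene name -> per-sample values) by F, keep n_top."""
    ranked = sorted(
        expression,
        key=lambda g: anova_f(expression[g], labels),
        reverse=True,
    )
    return ranked[:n_top]


def one_vs_rest_auc(scores, labels, positive):
    """AUC of one class against all others (rank / Mann-Whitney form)."""
    pos = [s for s, y in zip(scores, labels) if y == positive]
    neg = [s for s, y in zip(scores, labels) if y != positive]
    # Fraction of positive/negative pairs ordered correctly; ties count half.
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): six samples, three subtypes, two genes.
labels = ["ICUAW", "ICUAW", "congenital", "congenital",
          "immobility", "immobility"]
expression = {
    "gene_a": [1.0, 1.2, 5.0, 5.1, 9.0, 9.2],  # differs strongly by subtype
    "gene_b": [3.0, 7.0, 3.1, 7.1, 2.9, 6.9],  # varies mostly within subtype
}
# Hypothetical classifier scores for the "ICUAW" class, one per sample.
icuaw_scores = [0.9, 0.4, 0.3, 0.6, 0.2, 0.1]

print(top_genes(expression, labels, 1))                 # ['gene_a']
print(one_vs_rest_auc(icuaw_scores, labels, "ICUAW"))   # 0.875
```

In a realistic setting, the F ranking would be computed on the training split only, so that the held-out third of the samples does not influence which genes enter the signature.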
