Journal: PLOS Computational Biology

Improve the model of disease subtype heterogeneity by leveraging external summary data

Abstract

Researchers are often interested in understanding disease subtype heterogeneity by testing whether a risk exposure has the same level of effect on different disease subtypes. The polytomous logistic regression (PLR) model provides a flexible tool for such an evaluation. Disease subtype heterogeneity can also be investigated with a case-only study, which uses a case-case comparison to directly assess the difference between the risk effects on two disease subtypes. Motivated by a large consortium project on the genetic basis of non-Hodgkin lymphoma (NHL) subtypes, we develop PolyGIM, a procedure that fits the PLR model by integrating individual-level data with summary data extracted from multiple studies conducted under different designs. The summary data consist of coefficient estimates from working logistic regression models fitted by external studies. Examples of the working model include the case-case comparison model and the case-control comparison model, which compares the control group with either a single subtype group or a broad disease group formed by merging several subtypes. PolyGIM efficiently evaluates risk effects and provides a powerful test for disease subtype heterogeneity in situations where only summary data, rather than individual-level data, are available from external studies because of informatics and privacy constraints. We investigate the theoretical properties of PolyGIM and use simulation studies to demonstrate its advantages. Using data from eight genome-wide association studies within the NHL consortium, we apply it to study the effect of the polygenic risk score defined by a lymphoid malignancy on the risks of four NHL subtypes. These results show that PolyGIM can be a valuable tool for pooling data from multiple sources for a more coherent evaluation of disease subtype heterogeneity.

Author summary

Researchers usually classify a disease condition into subtypes with different progression patterns and treatment responses. Multiple studies often investigate a complex disease, but not all of them consider the same set of subtypes. In addition, because of informatics and privacy constraints, it can be challenging to pool individual-level data across all studies for more efficient analyses. Summary data, on the other hand, such as those generated from genetic association studies, can be easily accessed. We develop PolyGIM, a flexible statistical framework that integrates detailed individual-level data with summary data from multiple sources to comprehensively assess the risk effect on different disease subtypes. We use PolyGIM to understand the genetic basis underlying four major non-Hodgkin lymphoma subtypes.
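
The link between the PLR model and the case-case comparison mentioned in the abstract can be illustrated with a small numerical sketch. It assumes the standard baseline-category form of the PLR model with controls as the reference group, which the abstract does not spell out. The intercepts and log odds ratios are hypothetical values chosen for illustration, and the code is not part of PolyGIM. Under that assumption, the log odds of subtype j versus subtype k among cases is linear in the exposure, with slope equal to the difference of the two subtype-specific log odds ratios, so a zero slope corresponds to no heterogeneity between those two subtypes.

```python
import math


def plr_probs(x, alphas, betas):
    """Return [P(control), P(subtype 1), ..., P(subtype K)] at exposure x
    under a baseline-category PLR model (controls are the reference)."""
    scores = [a + b * x for a, b in zip(alphas, betas)]
    denom = 1.0 + sum(math.exp(s) for s in scores)
    return [1.0 / denom] + [math.exp(s) / denom for s in scores]


def case_case_log_odds(x, alphas, betas, j, k):
    """Log odds of subtype j versus subtype k (1-indexed) at exposure x."""
    p = plr_probs(x, alphas, betas)
    return math.log(p[j] / p[k])


# Hypothetical parameters for three subtypes (illustration only).
alphas = [-2.0, -2.5, -3.0]  # subtype-specific intercepts
betas = [0.8, 0.3, 0.3]      # subtype-specific log odds ratios

# Slope of the case-case log odds per unit of exposure.
slope_12 = (case_case_log_odds(1.0, alphas, betas, 1, 2)
            - case_case_log_odds(0.0, alphas, betas, 1, 2))
slope_23 = (case_case_log_odds(1.0, alphas, betas, 2, 3)
            - case_case_log_odds(0.0, alphas, betas, 2, 3))

print(slope_12)  # matches betas[0] - betas[1] up to rounding
print(slope_23)  # matches betas[1] - betas[2] up to rounding
```

A case-case estimate from an external study therefore carries information about a difference of PLR coefficients rather than about either coefficient alone. Merging subtypes into a broad disease group does not give an exactly logistic model in general, which is consistent with the abstract calling the external models "working" models.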
