Journal: Information Processing & Management

Quantifying risk associated with clinical trial termination: A text mining approach

Abstract

Clinical trials that terminate prematurely without reaching conclusions raise financial, ethical, and scientific concerns. Scientific studies in all disciplines are initiated with extensive planning and deliberation, often by a team of highly trained scientists. To ensure that the quality, integrity, and feasibility of funded research projects meet the required standards, research-funding agencies such as the National Institutes of Health and the National Science Foundation pass proposed research plans through a rigorous peer review process before making funding decisions. Yet some study proposals pass the full scrutiny of scientific peer review, and the proposed investigations are still terminated before yielding results. This study demonstrates an algorithm that quantifies the risk of a study being terminated, based on an analysis of patterns in the language used to describe the study prior to its implementation. To quantify the risk of termination, we use data from the ClinicalTrials.gov repository, from which we extracted structured data that flagged study characteristics and unstructured text data that described the study goals, objectives, and methods in a standard narrative form. We propose an algorithm to extract from this unstructured text the distinctive words that are most frequently used to describe trials that were completed successfully vs. those that were terminated. Binary variables indicating the presence of these distinctive words in trial proposals are used as input to a random forest, along with standard structured data fields. In this paper, we demonstrate that this combined modeling approach yields robust predictive probabilities in terms of both sensitivity (0.56) and specificity (0.71), relative to a model that utilizes the structured data alone (sensitivity = 0.03, specificity = 0.97). These predictive probabilities can be applied to make judgements about a trial's feasibility using information that is available before any funding is granted.
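
The feature-construction step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not say how words are scored, so the ranking rule here (the gap in document frequency between terminated and completed trials) is an assumption, as are the function names, the toy descriptions, and the choice of "terminated" as the positive class. The random forest itself is omitted; the binary indicators produced here would be passed to it alongside the structured fields.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase a description and return its set of alphabetic tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def distinctive_words(completed, terminated, top_k=2):
    """Rank words by the gap in document frequency between the two groups.

    Assumption: this scoring rule is illustrative only; the abstract does
    not state how the paper scores candidate words.
    """
    df_completed = Counter(w for doc in completed for w in tokenize(doc))
    df_terminated = Counter(w for doc in terminated for w in tokenize(doc))
    vocab = set(df_completed) | set(df_terminated)
    gap = {
        w: df_terminated[w] / len(terminated) - df_completed[w] / len(completed)
        for w in vocab
    }
    # Largest absolute gap first; ties broken alphabetically for determinism.
    ranked = sorted(vocab, key=lambda w: (-abs(gap[w]), w))
    return ranked[:top_k]


def binary_features(text, words):
    """Return one 0/1 indicator per distinctive word present in the text."""
    tokens = tokenize(text)
    return [1 if w in tokens else 0 for w in words]


def sensitivity_specificity(y_true, y_pred):
    """Compute (sensitivity, specificity), treating 1 (terminated) as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


# Hypothetical toy descriptions, not taken from ClinicalTrials.gov.
completed_docs = [
    "randomized double blind placebo trial",
    "randomized trial of standard therapy",
]
terminated_docs = [
    "pilot open label trial",
    "pilot study of novel device",
]

words = distinctive_words(completed_docs, terminated_docs, top_k=2)
print(words)
print(binary_features("open label pilot", words))
print(sensitivity_specificity([1, 1, 0, 0], [1, 0, 0, 0]))
```

Under the assumption that the positive class is "terminated", the abstract's figures mean that the structured-only model catches about 3% of terminated trials (sensitivity = 0.03), while the combined model catches about 56% at the cost of a lower specificity (0.71 versus 0.97).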
