
Emerging Science


Abstract

The evolution of high-throughput technologies has made it possible to quantify the expression of several thousand genes simultaneously. Many oncologic applications have emerged, two of which will be discussed: developing gene signatures and finding molecular targets. A gene signature is a rule that predicts patient outcome, usually survival or progression, from the expression of a relatively small number of genes. The scientific community has been caught up in a proliferation of methods for developing gene signatures. These methods are usually derivatives of statistical regression or machine learning techniques. Despite the fanfare, there is little evidence that increased methodologic sophistication has resulted in substantial improvements in predictive accuracy. This can be explained by the possible abundance of “low-hanging fruit”: there are some, perhaps many, genes that are reasonably good predictors of outcome, and most sensible methods, including simpler ones, will capture some of these genes. The additional genes included in a rule will vary according to the method used, but will make only small improvements in overall predictive accuracy. This suggests that the return on investment in sophisticated methodology for making predictions is relatively low. Applying findings that claim good predictive accuracy requires a good deal of care about validation. In any field, initial findings usually get the lion’s share of excitement and credit. Despite widespread agreement that independent validation of these findings is scientifically essential, credible validation can be overlooked. This is especially important for gene signatures: the number of genes is an order of magnitude greater than the number of samples, so it is easy to overfit. To ensure that the incremental gains are not illusory, we should demand careful validation of gene signatures before they are adopted for routine use.
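The overfitting risk described above can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not a method from the abstract: the data are pure noise, the "signature" is a single gene chosen for its training accuracy, and the function name `simulate_noise_signature` and all sample sizes are hypothetical choices made for illustration.

```python
import random

def simulate_noise_signature(n_genes=200, n_train=20, n_valid=2000, seed=0):
    """Select the single 'best' gene on pure-noise data, then validate it."""
    rng = random.Random(seed)

    def draw(n):
        # Expression and outcome are independent: no gene is truly predictive.
        expr = [[rng.gauss(0.0, 1.0) for _ in range(n_genes)] for _ in range(n)]
        outcome = [rng.randint(0, 1) for _ in range(n)]
        return expr, outcome

    def accuracy(expr, outcome, gene, sign):
        # Rule: predict outcome 1 when sign * expression > 0.
        hits = sum((sign * row[gene] > 0) == bool(y) for row, y in zip(expr, outcome))
        return hits / len(outcome)

    train_x, train_y = draw(n_train)
    valid_x, valid_y = draw(n_valid)  # independent validation sample

    # "Develop" the signature: keep the gene and direction that fit training best.
    best = max(
        ((g, s) for g in range(n_genes) for s in (1, -1)),
        key=lambda gs: accuracy(train_x, train_y, gs[0], gs[1]),
    )
    return accuracy(train_x, train_y, *best), accuracy(valid_x, valid_y, *best)

train_acc, valid_acc = simulate_noise_signature()
print(f"training accuracy:   {train_acc:.2f}")
print(f"validation accuracy: {valid_acc:.2f}")
```

With ten times as many genes as training samples, some gene will fit the training outcomes well by chance, while its accuracy on the independent sample should stay near the 50% expected from guessing. Only the validation figure reflects how the rule would perform in practice.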
While gene signatures are developed specifically for predicting outcome, resourceful scientists have also used them to identify molecular targets. This looks like a free lunch at first: if overexpression of a gene increases the likelihood of progression or death, a molecular intervention suppressing that expression might be a reasonable treatment strategy. The fallacy in this thinking centers on the statistical concepts of correlation and causation. Suppose that in a simple world there are two genes, X and Y, and the expression of X is the only determinant of outcome O. It also happens that the expression of X drives the expression of Y, but the expression of Y has no mechanistic connection to O. This can be depicted with the following diagram: Y ← X → O. Note the absence of a direct link from Y to O. In this setting one can easily find Y to be a predictor of O because both of them follow from X. As long as Y is an accurate predictor of O, as established on a validation sample, there is nothing wrong with using it in practice, despite the fact that mechanistically Y and O are linked only through X. Nevertheless, it would be incorrect to conclude that Y is an appropriate target, because modifying Y will have no bearing on O. Now consider a pathway with several genes in a map such as the one above, and it becomes obvious that there are several possible mechanistic configurations. A gene signature has no reason to uncover these links, but without some understanding of the links it is impossible to identify targets. This reasoning explains another phenomenon that has baffled some oncologists. It is entirely possible to have two (in fact several) gene signatures that have no genes in common. Because of slight variations in methodology or sampling error, different genes may be used to represent the information contained in one set of genes. In other words, gene signatures are hardly unique.
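The two-gene example can be made concrete with a short simulation. This is a sketch under assumed numbers, not an analysis from the abstract: the noise levels, the threshold rule, and the names `simulate`, `set_y`, and `shift_x` are all hypothetical. X drives both Y and O, and there is no link from Y to O.

```python
import random

def simulate(n=5000, seed=1, set_y=None, shift_x=0.0):
    """Simulate X -> Y and X -> O with no link from Y to O.

    set_y   : if given, force every subject's Y to this value (intervention on Y).
    shift_x : amount added to X (intervention on X).
    Returns (event rate of O, accuracy of the rule 'predict O = 1 when Y > 0').
    """
    rng = random.Random(seed)
    events = hits = 0
    for _ in range(n):
        # Draw all noise terms first so every scenario sees identical subjects.
        x0 = rng.gauss(0, 1)
        noise_y = rng.gauss(0, 1)
        noise_o = rng.gauss(0, 1)
        x = x0 + shift_x
        y = set_y if set_y is not None else x + 0.5 * noise_y  # Y follows X
        o = 1 if x + 0.5 * noise_o > 0 else 0                  # O follows X only
        events += o
        hits += (y > 0) == (o == 1)
    return events / n, hits / n

base_rate, y_accuracy = simulate()
rate_y_suppressed, _ = simulate(set_y=-2.0)    # "target" Y
rate_x_suppressed, _ = simulate(shift_x=-2.0)  # target X instead

print(f"accuracy of Y as a predictor of O: {y_accuracy:.2f}")
print(f"event rate, no intervention:       {base_rate:.2f}")
print(f"event rate, Y suppressed:          {rate_y_suppressed:.2f}")
print(f"event rate, X suppressed:          {rate_x_suppressed:.2f}")
```

In this setup Y predicts O well in observational data, yet forcing Y to a low value leaves the event rate exactly as it was, because O is computed from X alone. Suppressing X is the only intervention that changes the outcome, which is the distinction between a useful predictor and a useful target.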
This presentation will emphasize the differences between the statistical methods used and required for developing gene signatures and those needed for identifying molecular targets. For the former, validation methods should be considered routinely and carefully, while the choice of methodology seems to be a less critical factor. Identifying molecular targets will require more careful and focused experimentation than large-sample microarray studies.

Gastrointest Cancer Res. 2008 Sep-Oct; 2(5 Suppl 3): S14–S15.
