Journal: Cancer Informatics

Computing Molecular Signatures as Optima of a Bi-Objective Function: Method and Application to Prediction in Oncogenomics



Abstract

Background: Filter feature selection methods compute molecular signatures by selecting subsets of genes according to their ranking under a valuation function. The motivation for the choice of valuation function is almost always clearly stated, but the motivation for selecting genes according to their ranking is hardly ever made explicit.

Method: We addressed the computation of molecular signatures by searching for the optima of a bi-objective function whose solution space was the set of all possible molecular signatures, i.e., the set of subsets of genes. The two objectives were the size of the signature (to be minimized) and the interclass distance induced by the signature (to be maximized).

Results: We showed that: 1) the convex combination of the two objectives had exactly n optimal non-empty signatures, where n was the number of genes; 2) the n optimal signatures were nested; and 3) the optimal signature of size k was the subset of the k top-ranked genes that contributed the most to the interclass distance. We applied our feature selection method to five public datasets in oncology and assessed the prediction performance of the optimal signatures as input to the diagonal linear discriminant analysis (DLDA) classifier. Performance was on par with or better than the best reported results. The predictions were robust, and the signatures were almost always significantly smaller. We studied in more detail the performance of our predictive modeling on two breast cancer datasets used to predict the response to preoperative chemotherapy: performance was higher than previously reported, the signatures were about three times smaller (11-gene versus 30-gene signatures), and the genes in the signature were known to be involved in the response to chemotherapy.

Conclusions: Defining molecular signatures as the optima of a bi-objective function that combined the signature size and the interclass distance was well founded and efficient for prediction in oncogenomics. The complexity of the computation was very low because the optimal signatures were the sets of top-ranked genes under the valuation function. Software can be freely downloaded from http://gardeux-vincent.eu/DeltaRanking.php
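The ranking result stated in the abstract can be illustrated with a minimal Python sketch. It is not the authors' DeltaRanking software. It assumes that the interclass distance is additive over genes and that a gene's contribution is the absolute difference of class means divided by the overall standard deviation; the abstract specifies neither. The function names, the weight `lam`, and the toy values are all hypothetical. Under these assumptions the convex combination is separable per gene, so each optimum is a set of top-ranked genes, and lowering the weight on the distance shrinks the signature in a nested way.

```python
import numpy as np

def gene_contributions(X, y):
    """Per-gene contribution to an additive interclass distance.

    ASSUMPTION: |mean(class 0) - mean(class 1)| / overall std per gene.
    X has shape (samples, genes); y holds binary labels 0/1.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    mean_diff = np.abs(X[y == 0].mean(axis=0) - X[y == 1].mean(axis=0))
    spread = X.std(axis=0) + 1e-12  # guard against constant genes
    return mean_diff / spread

def top_k_signature(contrib, k):
    """Indices of the k genes with the largest contributions."""
    order = np.argsort(-np.asarray(contrib), kind="stable")
    return [int(g) for g in order[:k]]

def optimal_signature(contrib, lam):
    """Maximize lam * sum(contrib[S]) - (1 - lam) * |S| over gene subsets S.

    The objective is a sum of independent per-gene gains, so a gene
    belongs to the optimum exactly when its own gain is positive.
    """
    contrib = np.asarray(contrib, dtype=float)
    gain = lam * contrib - (1.0 - lam)
    order = np.argsort(-contrib, kind="stable")
    return [int(g) for g in order if gain[g] > 0]

# Toy contributions for four hypothetical genes.
contrib = np.array([0.5, 3.0, 1.0, 2.0])
small = optimal_signature(contrib, 0.3)   # strong penalty on size
medium = optimal_signature(contrib, 0.5)
large = optimal_signature(contrib, 0.9)   # weak penalty on size
```

In this toy example the three signatures are nested and each equals the top-k genes of the ranking for its size, which mirrors results 2) and 3) of the abstract under the stated assumptions.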
