Source: Cancers (U.S. National Institutes of Health literature)

Machine Learning-Based Ensemble Recursive Feature Selection of Circulating miRNAs for Cancer Tumor Classification

Abstract

Circulating microRNAs (miRNAs) are small noncoding RNA molecules that can be detected in bodily fluids without the need for major invasive procedures on patients. miRNAs have shown great promise as biomarkers for tumors, both to assess their presence and to predict their type and subtype. Recently, thanks to the availability of miRNA datasets, machine learning techniques have been successfully applied to tumor classification. The results, however, are difficult for medical experts to assess and interpret because the algorithms exploit information from thousands of miRNAs. In this work, we propose a novel technique that aims at reducing the necessary information to the smallest possible set of circulating miRNAs. The dimensionality reduction achieved represents a very important first step in a potential, clinically actionable, circulating miRNA-based precision medicine pipeline. While it is currently under discussion whether this first step can be taken, we demonstrate here that it is possible to perform classification tasks by exploiting a recursive feature elimination procedure that integrates a heterogeneous ensemble of high-quality, state-of-the-art classifiers on circulating miRNAs. Heterogeneous ensembles can compensate for the inherent biases of individual classifiers by using different classification algorithms. Selecting features then further eliminates biases emerging from using data from different studies or batches, yielding more robust and reliable outcomes. The proposed approach is first tested on a tumor classification problem, separating 10 different types of cancer with samples collected over 10 different clinical trials, and is later assessed on a cancer subtype classification task, with the aim of distinguishing triple negative breast cancer from other subtypes of breast cancer. Overall, the presented methodology proves to be effective and compares favorably to other state-of-the-art feature selection methods.
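
The general idea of recursive feature elimination driven by a heterogeneous ensemble can be illustrated with a minimal sketch. This is not the authors' procedure: the abstract does not name the classifiers, the rule for combining their rankings, or the elimination schedule, so the three scikit-learn models, the rank-sum aggregation, the `drop_frac` parameter, and the synthetic data standing in for miRNA expression values are all assumptions made for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression, RidgeClassifier
from sklearn.preprocessing import StandardScaler


def feature_importance(model, X, y):
    """Fit one classifier and return a non-negative importance per feature."""
    model.fit(X, y)
    if hasattr(model, "feature_importances_"):  # tree-based models
        return np.asarray(model.feature_importances_)
    # Linear models: mean absolute coefficient across classes.
    return np.abs(np.atleast_2d(model.coef_)).mean(axis=0)


def make_models():
    """Hypothetical heterogeneous ensemble (assumed, not from the paper)."""
    return [
        RandomForestClassifier(n_estimators=50, random_state=0),
        LogisticRegression(max_iter=1000),
        RidgeClassifier(),
    ]


def ensemble_rfe(X, y, make_models, n_keep, drop_frac=0.2):
    """Recursively drop the features the ensemble ranks lowest overall."""
    active = np.arange(X.shape[1])  # indices of surviving features
    while len(active) > n_keep:
        rank_sum = np.zeros(len(active))
        for model in make_models():
            imp = feature_importance(model, X[:, active], y)
            # argsort of argsort turns importances into ranks (0 = least important)
            rank_sum += np.argsort(np.argsort(imp))
        # Drop a fraction of the features, but never go below n_keep.
        n_drop = max(1, int(drop_frac * len(active)))
        n_drop = min(n_drop, len(active) - n_keep)
        keep = np.sort(np.argsort(rank_sum)[n_drop:])
        active = active[keep]
    return active


# Synthetic stand-in for an expression matrix: 200 samples, 30 features, 3 classes.
X, y = make_classification(
    n_samples=200, n_features=30, n_informative=5, n_redundant=0,
    n_classes=3, n_clusters_per_class=1, random_state=0,
)
X = StandardScaler().fit_transform(X)  # comparable scales for linear coefficients
selected = ensemble_rfe(X, y, make_models, n_keep=5)
print("selected feature indices:", selected)
```

Summing ranks rather than raw importances keeps any single classifier from dominating the vote, since tree importances and linear coefficients live on different scales. A real pipeline would also estimate importances inside cross-validation folds and check classification accuracy on the reduced feature set at each step.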