Journal: BMC Bioinformatics

Automatic discovery of 100-miRNA signature for cancer classification using ensemble feature selection

Abstract

MicroRNAs (miRNAs) are noncoding RNA molecules heavily involved in human tumors, few of which circulate in the human body. Finding a tumor-associated miRNA signature, that is, the minimal set of miRNAs that must be measured to discriminate both among different types of cancer and between tumor and normal tissues, is of utmost importance. Feature selection techniques from machine learning can help, but they often provide naive or biased results. An ensemble feature selection strategy for miRNA signatures is proposed: miRNAs are chosen based on consensus on feature relevance among high-accuracy classifiers of different typologies. This methodology aims to identify signatures that are considerably more robust and reliable when used in clinically relevant prediction tasks. Using the proposed method, a 100-miRNA signature is identified in a dataset of 8023 samples extracted from TCGA. When eight state-of-the-art classifiers are run with the 100-miRNA signature and with the original 1046 features, global accuracy differs by only 1.4%. Importantly, this 100-miRNA signature is sufficient to distinguish between tumor and normal tissues. The approach is then compared against other feature selection methods, such as UFS, RFE, EN, LASSO, Genetic Algorithms, and EFS-CLA. The proposed approach provides better accuracy when tested in 10-fold cross-validation with different classifiers. It is also applied to several GEO datasets across different platforms, with some classifiers showing more than 90% classification accuracy, demonstrating its cross-platform applicability. The 100-miRNA signature is sufficiently stable to provide almost the same classification accuracy as the complete TCGA dataset, and it is further validated on several GEO datasets across different types of cancer and platforms.
Furthermore, a bibliographic analysis confirms that 77 of the 100 miRNAs in the signature appear in lists of circulating miRNAs used in cancer studies, in stem-loop or mature-sequence form. The remaining 23 miRNAs offer potentially promising avenues for future research.
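
The consensus idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the voting rule, so the top-k cutoff, the `min_votes` threshold, the function names, the classifier names, and the toy importance scores below are all assumptions made for illustration. Each classifier contributes its k highest-ranked features, and a feature enters the signature when enough classifiers agree on it.

```python
from collections import Counter
from typing import Dict, List


def top_k(importances: Dict[str, float], k: int) -> List[str]:
    """Return the k feature names with the largest importance scores."""
    # Ties are broken alphabetically so the result is deterministic.
    ranked = sorted(importances, key=lambda name: (-importances[name], name))
    return ranked[:k]


def consensus_signature(per_classifier: Dict[str, Dict[str, float]],
                        k: int, min_votes: int) -> List[str]:
    """Keep features found in the top-k list of at least min_votes classifiers.

    The voting rule is an assumption for illustration only.
    """
    votes: Counter = Counter()
    for importances in per_classifier.values():
        votes.update(top_k(importances, k))
    selected = [name for name, count in votes.items() if count >= min_votes]
    # Most-agreed-upon features first, then alphabetical order.
    return sorted(selected, key=lambda name: (-votes[name], name))


# Made-up importance scores for three hypothetical classifiers.
toy_scores = {
    "random_forest": {"miR-a": 0.9, "miR-b": 0.5, "miR-c": 0.1, "miR-d": 0.3},
    "linear_svm":    {"miR-a": 0.7, "miR-b": 0.2, "miR-c": 0.6, "miR-d": 0.1},
    "logistic":      {"miR-a": 0.8, "miR-b": 0.6, "miR-c": 0.2, "miR-d": 0.4},
}
signature = consensus_signature(toy_scores, k=2, min_votes=2)
print(signature)
```

In this toy input, raising `min_votes` shrinks the signature to the features every classifier agrees on, while lowering it admits features that only one classifier ranks highly. In practice the per-classifier scores would come from trained models rather than hand-written dictionaries.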