Conference proceedings: Asia-Pacific Bioinformatics Conference

Biomarker rediscovery using Random Forests-based gene selection from microarray data of colon cancer



Abstract

Background: Since the DNA microarray technique was first described in 1995 and quickly adopted by the research community, it has greatly facilitated studies that identify biological markers, classify tumour types, and predict outcome and response to chemotherapy. Although researchers have proposed many approaches for handling microarray data, it remains a challenge to select the feature genes that carry the most information about a disease and to better understand the underlying biological phenomena. Random Forests (hereafter RF) is an important method for classification and regression that can also return a set of feature genes. Additionally, to gain insight into biological phenomena, methods that consider not only the single-gene level but also the multi-gene level can be used, e.g., pathway analysis and gene ontology (hereafter GO). These methods may lead to a better biological interpretation and provide a way to explore feature genes that is independent of the data.

Results: In this paper, we ultimately selected 60 genes, which produced the smallest out-of-bag (OOB) error rate with RF. Furthermore, using certain pathway and gene ontology databases, we confirmed that most of the genes selected by our approach have a notable association with the formation and progression of various kinds of human cancer.

Conclusions: We adopted an RF-based method of feature gene selection that incorporates backward elimination to process colon cancer data, and identified 60 genes that better classify tumour and normal samples. For biological interpretation and verification of these genes, pathway analysis and gene functional annotation were also used. The results showed that our approach is viable for identifying feature genes and precise for the molecular classification of colon tumours.
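
The selection procedure summarized in the abstract (fit an RF, discard the least important genes, and keep the gene set with the smallest OOB error) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the software, the fraction of genes dropped per round, the number of trees, or the importance measure. The sketch assumes scikit-learn, its impurity-based `feature_importances_`, a hypothetical function name `rf_backward_selection`, an illustrative drop fraction of 20% per iteration, and synthetic data in place of the colon cancer microarray set.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier


def rf_backward_selection(X, y, drop_fraction=0.2, min_genes=2,
                          n_trees=200, seed=0):
    """Backward elimination driven by Random Forest importances.

    Returns (best_gene_indices, best_oob_error, history), where history
    holds one (n_genes, oob_error) pair per fitted forest.
    All defaults are illustrative assumptions, not values from the paper.
    """
    if not 0.0 < drop_fraction < 1.0:
        raise ValueError("drop_fraction must be strictly between 0 and 1")
    genes = np.arange(X.shape[1])  # column indices still under consideration
    best_genes, best_err = genes, 1.0
    history = []
    while True:
        rf = RandomForestClassifier(n_estimators=n_trees, oob_score=True,
                                    random_state=seed)
        rf.fit(X[:, genes], y)
        err = 1.0 - rf.oob_score_  # OOB error rate for the current gene set
        history.append((len(genes), err))
        # "<=" prefers the smaller gene set when two sets tie on OOB error.
        if err <= best_err:
            best_genes, best_err = genes, err
        if len(genes) <= min_genes:
            break
        # Keep the most important genes; drop roughly drop_fraction of them.
        n_keep = max(min_genes, int(len(genes) * (1 - drop_fraction)))
        order = np.argsort(rf.feature_importances_)[::-1]
        genes = genes[order[:n_keep]]
    return best_genes, best_err, history


if __name__ == "__main__":
    # Synthetic stand-in for an expression matrix (samples x genes).
    X, y = make_classification(n_samples=60, n_features=200,
                               n_informative=10, random_state=0)
    genes, err, history = rf_backward_selection(X, y)
    print(f"selected {len(genes)} genes, OOB error {err:.3f}")
```

Because the OOB error is estimated from the samples each tree did not see during training, the loop needs no separate validation split, which matters when the number of samples is small relative to the number of genes.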
