Biomolecular feature selection of colorectal cancer microarray data using GA-SVM hybrid and noise perturbation to address overfitting.

Abstract

In 2008, there were over 100,000 newly reported cases of colon cancer and 40,000 cases of rectal cancer in the United States. To reduce the number of deaths from these diseases, researchers have been striving to find a set of genes that can accurately characterize the prognosis for colorectal cancer. Working with a gene expression microarray dataset of about 55,000 genes, collected from 122 colorectal cancer patients, this research developed a method for identifying an optimal set of features through several stages of feature selection: coarse feature reduction, fine feature selection, and classification using a Genetic Algorithm/Support Vector Machine (GA/SVM) hybrid. However, microarray data with these dimensions are feature-rich and case-poor, which creates a risk of overfitting. To combat this issue, a noise perturbation scheme was introduced, on the assumption that genes able to survive in this noise have a strong relation to colorectal cancer. The feature reduction methods produced chromosomes containing genes with a known relation to cancer. However, the perturbation analysis, which was designed to confirm these genes, was deemed inconclusive. This research succeeded in developing a feature reduction method that suggests a set of genes with potential ties to colorectal cancer, prompting further investigation into this relationship.
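The pipeline summarized in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: it assumes synthetic data, a fixed chromosome size, and a nearest-centroid classifier standing in for the SVM so that the example runs on the Python standard library alone. All function names and parameter values (`make_data`, `centroid_accuracy`, `run_ga`, `survival_counts`, `sigma`, and so on) are hypothetical.

```python
import random


def make_data(rng, n_cases=30, n_genes=30, informative=(0, 1, 2)):
    """Synthetic expression matrix; only `informative` genes shift with the label."""
    X, y = [], []
    for i in range(n_cases):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
        for g in informative:
            row[g] += 3.0 * label
        X.append(row)
        y.append(label)
    return X, y


def centroid_accuracy(X, y, genes):
    """Leave-one-out accuracy of a nearest-centroid classifier (stand-in for the SVM)."""
    correct = 0
    for i in range(len(X)):
        sums = {0: [0.0] * len(genes), 1: [0.0] * len(genes)}
        counts = {0: 0, 1: 0}
        for j in range(len(X)):
            if j == i:
                continue  # hold out case i
            counts[y[j]] += 1
            for k, g in enumerate(genes):
                sums[y[j]][k] += X[j][g]
        dist = {
            c: sum((X[i][g] - sums[c][k] / counts[c]) ** 2 for k, g in enumerate(genes))
            for c in (0, 1)
        }
        predicted = 0 if dist[0] <= dist[1] else 1
        if predicted == y[i]:
            correct += 1
    return correct / len(X)


def run_ga(X, y, rng, k=3, pop_size=20, generations=10):
    """Evolve chromosomes (lists of k distinct gene indices) scored by classifier accuracy."""
    n_genes = len(X[0])
    pop = [rng.sample(range(n_genes), k) for _ in range(pop_size)]

    def fitness(chrom):
        return centroid_accuracy(X, y, chrom)

    for _ in range(generations):
        parents = sorted(pop, key=fitness, reverse=True)[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Crossover: draw k genes from the union of both parents.
            child = rng.sample(sorted(set(a) | set(b)), k)
            # Mutation: occasionally swap in a gene not already present.
            if rng.random() < 0.3:
                new_gene = rng.randrange(n_genes)
                if new_gene not in child:
                    child[rng.randrange(k)] = new_gene
            children.append(child)
        pop = parents + children  # parents are kept, so the best score never drops
    return sorted(max(pop, key=fitness))


def survival_counts(X, y, rng, trials=3, sigma=0.5):
    """Rerun the GA on noise-perturbed copies; count how often each gene is selected."""
    counts = {}
    for _ in range(trials):
        noisy = [[v + rng.gauss(0.0, sigma) for v in row] for row in X]
        for g in run_ga(noisy, y, rng):
            counts[g] = counts.get(g, 0) + 1
    return counts


if __name__ == "__main__":
    rng = random.Random(0)
    X, y = make_data(rng)
    print("best chromosome:", run_ga(X, y, rng))
    print("survival counts:", survival_counts(X, y, rng))
```

In this sketch, a gene that keeps appearing in the best chromosome across perturbed runs is treated as having "survived" the noise. A real analysis on 55,000 genes and 122 cases would also need the coarse reduction step and an actual SVM, both of which are omitted here.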

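Bibliographic record

  • Author

    Mizaku, Alda.;

  • Author affiliation

    State University of New York at Binghamton.;

  • Degree-granting institution State University of New York at Binghamton.;
  • Subject Engineering Biomedical.;Biology Bioinformatics.;Health Sciences Oncology.
  • Degree M.S.
  • Year 2009
  • Pages 80 p.
  • Total pages 80
  • Original format PDF
  • Text language English
  • Chinese Library Classification Biomedical engineering; Oncology;
  • Keywords

  • Date indexed 2022-08-17 11:38:23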
