Journal: Artificial Intelligence in Medicine

Filter versus wrapper gene selection approaches in DNA microarray domains



Abstract

DNA microarray experiments, which generate thousands of gene expression measurements, are used to collect information from tissue and cell samples about gene expression differences that could be useful for disease diagnosis, distinguishing specific tumor types, etc. One important application of gene expression microarray data is the classification of samples into known categories. Because DNA microarray technology measures gene expression en masse, the resulting data have far more features (genes) than samples. Since the predictive accuracy of supervised classifiers that try to discriminate between the classes of the problem decays in the presence of irrelevant and redundant features, a dimensionality reduction process is essential. We propose applying a gene selection process, which also enables the biology researcher to focus on promising gene candidates that actively contribute to classification in these large-scale microarrays. Two basic approaches to feature selection appear in the machine learning and pattern recognition literature: the filter and wrapper techniques. Filter procedures are used in most work in the area of DNA microarrays. In this work, a group of different filter metrics is compared with a wrapper sequential search procedure. The comparison is performed on two well-known DNA microarray datasets using four classic supervised classifiers. The study is carried out over both the original continuous and the three-interval discretized gene expression data. While two well-known filter metrics are proposed for continuous data, four classic filter measures are used over discretized data. The same wrapper approach is used for both continuous and discretized data.
Applying filter and wrapper gene selection procedures yields considerably better accuracy than using no gene selection, together with notable dimensionality reductions. Although the wrapper approach is generally more accurate than the filter metrics, this improvement comes at a considerable computational cost. We note that most of the genes selected by the proposed filter and wrapper procedures in discrete and continuous microarray data appear in the lists of relevant, informative genes detected by previous studies over these datasets. The aim of this work is to contribute to the field of gene selection in DNA microarray datasets. Through an extensive comparison with the more popular filter techniques, we would like to contribute to the expansion and study of the wrapper approach in this type of domain.
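
The difference between the two families of techniques named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the signal-to-noise style filter score, the nearest-centroid classifier, the leave-one-out estimate, and all function names and toy values below are assumptions chosen only to keep the example short and dependency-free.

```python
from statistics import mean, stdev

def filter_rank(X, y):
    """Filter: score every gene on its own, without consulting a classifier.
    Assumed metric: |difference of class means| / (sum of class std devs)."""
    scores = []
    for g in range(len(X[0])):
        a = [row[g] for row, label in zip(X, y) if label == 0]
        b = [row[g] for row, label in zip(X, y) if label == 1]
        spread = (stdev(a) + stdev(b)) or 1e-12  # guard against zero spread
        scores.append(abs(mean(a) - mean(b)) / spread)
    # Gene indices ordered from most to least discriminative.
    return sorted(range(len(scores)), key=lambda g: -scores[g])

def loo_accuracy(X, y, genes):
    """Leave-one-out accuracy of a nearest-centroid classifier on `genes`."""
    hits = 0
    for i in range(len(X)):
        centroids = {}
        for label in sorted(set(y)):
            rows = [X[j] for j in range(len(X)) if j != i and y[j] == label]
            centroids[label] = [mean(r[g] for r in rows) for g in genes]

        def dist(label):
            return sum((X[i][g] - c) ** 2
                       for g, c in zip(genes, centroids[label]))

        hits += min(centroids, key=dist) == y[i]
    return hits / len(X)

def wrapper_forward(X, y, max_genes=3):
    """Wrapper: sequential forward search guided by classifier accuracy.
    Adds one gene at a time and stops when accuracy no longer improves."""
    selected, best = [], 0.0
    while len(selected) < max_genes:
        remaining = [g for g in range(len(X[0])) if g not in selected]
        if not remaining:
            break
        scored = [(loo_accuracy(X, y, selected + [g]), g) for g in remaining]
        acc, gene = max(scored, key=lambda t: t[0])
        if acc <= best:
            break
        selected.append(gene)
        best = acc
    return selected, best

# Hypothetical toy data: 6 samples x 4 genes, two classes.
X = [
    [1.0, 5.1, 2.0, 3.0],
    [1.2, 4.9, 2.5, 3.1],
    [0.9, 5.0, 1.8, 2.9],
    [3.0, 5.0, 2.2, 3.0],
    [3.2, 5.2, 2.6, 3.1],
    [2.9, 4.8, 1.9, 2.9],
]
y = [0, 0, 0, 1, 1, 1]

ranking = filter_rank(X, y)
selected, accuracy = wrapper_forward(X, y)
print("filter ranking:", ranking)
print("wrapper selection:", selected, "LOO accuracy:", accuracy)
```

The filter scores each gene once, so its cost grows linearly with the number of genes. The wrapper retrains and re-evaluates the classifier for every candidate gene at every step, which is the source of the computational cost mentioned above.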
