MR~2: A Two-stage Feature Selection Algorithm in High-throughput Methylation Data for Max-relevance and Min-redundancy



Abstract

Recent advances reveal that DNA methylation plays an important role in regulating different genome functions, and anomalous methylation levels are associated with various cancer types. Feature selection algorithms are geared towards high-throughput analysis of DNA methylation to help identify idiosyncratic DNA methylation profiles associated with cancer types and subtypes. In high-dimensional and highly correlated DNA methylation data, feature selection algorithms aim at selecting an efficient and comprehensive feature set to better capture characteristics of phenotypes. In this work, we introduce a two-stage feature selection algorithm (MR2) based on maximum relevance and minimum redundancy criteria. In the first stage, the features that satisfy the relevance conditions are filtered; in the second stage, the final subset of loci is selected to reach minimal redundancy by using a k-medoids clustering algorithm that embeds a succinct uncertainty measure score. The performance of the proposed feature selection algorithm is benchmarked against principal component analysis and four other commonly used filtering methods on lung and breast cancer datasets obtained from the Gene Expression Omnibus, in terms of classification error in support vector machine classifiers. Our MR2 algorithm outperforms these filtering-based algorithms while also providing more interpretable results.
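The two-stage idea in the abstract (filter by relevance, then cluster the survivors and keep one representative per cluster) can be sketched roughly as follows. This is an illustration only, not the authors' MR2 implementation: the abstract does not specify the relevance conditions or the uncertainty measure score, so the sketch substitutes a standardized class-mean difference for relevance and 1 - |Pearson correlation| as the clustering distance. The function names, the binary 0/1 labels, the parameters `n_relevant` and `k`, and the toy data are all assumptions made for the example.

```python
import math
import random
from statistics import mean, pstdev


def relevance(column, labels):
    """Stand-in relevance score (assumption): |class-mean difference| / overall std."""
    a = [v for v, y in zip(column, labels) if y == 0]
    b = [v for v, y in zip(column, labels) if y == 1]
    spread = pstdev(column) or 1e-12  # guard against constant features
    return abs(mean(a) - mean(b)) / spread


def abs_corr(x, y):
    """Absolute Pearson correlation, clamped to [0, 1]."""
    mx, my = mean(x), mean(y)
    sxy = sum((p - mx) * (q - my) for p, q in zip(x, y))
    sxx = sum((p - mx) ** 2 for p in x)
    syy = sum((q - my) ** 2 for q in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return min(1.0, abs(sxy) / math.sqrt(sxx * syy))


def k_medoids(dist, k, seed=0, max_iter=100):
    """Plain alternating k-medoids on a precomputed distance matrix."""
    n = len(dist)
    medoids = random.Random(seed).sample(range(n), k)
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest medoid;
        # a medoid always stays in its own cluster, so no cluster is empty.
        clusters = {m: [] for m in medoids}
        for i in range(n):
            nearest = i if i in clusters else min(medoids, key=lambda m: dist[i][m])
            clusters[nearest].append(i)
        # Update step: the new medoid minimizes total distance within its cluster.
        new_medoids = [
            min(members, key=lambda c: sum(dist[c][j] for j in members))
            for members in clusters.values()
        ]
        if set(new_medoids) == set(medoids):
            break
        medoids = new_medoids
    return sorted(medoids)


def two_stage_select(X, labels, n_relevant, k, seed=0):
    """Stage 1: keep the n_relevant top-scoring features.
    Stage 2: cluster them by redundancy and return the k medoid features."""
    n_features = len(X[0])
    columns = [[row[j] for row in X] for j in range(n_features)]
    ranked = sorted(range(n_features),
                    key=lambda j: relevance(columns[j], labels), reverse=True)
    kept = ranked[:n_relevant]
    dist = [[1.0 - abs_corr(columns[a], columns[b]) for b in kept] for a in kept]
    return sorted(kept[m] for m in k_medoids(dist, k, seed))


# Toy methylation-like values in [0, 1]: 8 samples (rows) x 5 loci (columns).
# Loci 0-2 separate the two classes; loci 3-4 have identical class means.
X = [
    [0.1, 0.11, 0.2, 0.5, 0.3],
    [0.2, 0.21, 0.1, 0.4, 0.6],
    [0.1, 0.11, 0.3, 0.5, 0.6],
    [0.2, 0.21, 0.2, 0.4, 0.3],
    [0.8, 0.81, 0.7, 0.5, 0.3],
    [0.9, 0.91, 0.6, 0.4, 0.6],
    [0.8, 0.81, 0.9, 0.5, 0.6],
    [0.9, 0.91, 0.7, 0.4, 0.3],
]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
selected = two_stage_select(X, labels, n_relevant=3, k=2)
print(selected)
```

Because each selected feature is an actual locus (a cluster medoid) rather than a linear combination of loci, the output stays directly interpretable, which is in the spirit of the contrast the abstract draws with principal component analysis. The simple alternating k-medoids used here can stop at a local optimum, so which representative is chosen may depend on the seed.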


Bibliographic record

Source
Industrial and Systems Engineering Annual Conference and Expo | 2018 | 729p | 6 pages
Authors
Haluk Damgacioglu; Nurcin Celik; Emrah Celik
Original format: PDF
Chinese Library Classification: TP29-53
Keywords
DNA methylation; Beta distribution; Feature selection; Classification; Minimal redundancy

Similar literature

1. Min-redundancy and max-relevance multi-view feature selection for predicting ovarian cancer survival using multi-omics data [J]. Yasser EL-Manzalawy, Tsung-Yu Hsieh, Manu Shivakumar, BMC Medical Genomics. 2018, No. 3
2. Feature selection based on mutual information criteria of max-dependency, max-relevance, and min-redundancy [J]. Hanchuan Peng, Fuhui Long, Ding C. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2005, No. 8
3. A novel sub-models selection algorithm based on max-relevance and min-redundancy neighborhood mutual information [J]. Xiao Ling, Wang Chen, Dong Yunxuan, Information Sciences: An International Journal. 2019
4. MR~2: A Two-stage Feature Selection Algorithm in High-throughput Methylation Data for Max-relevance and Min-redundancy [C]. Haluk Damgacioglu, Nurcin Celik, Emrah Celik. Industrial and Systems Engineering Annual Conference and Expo. 2018
5. Comparative Analysis of Feature Selection and Classification Methods for Epigenetic Methylation Data [D]. Kleyn, Aaron. 2021
6. Min-redundancy and max-relevance multi-view feature selection for predicting ovarian cancer survival using multi-omics data [O]. Yasser EL-Manzalawy, Tsung-Yu Hsieh, Manu Shivakumar, 2018
7. Data Mining Feature Subset Weighting and Selection Using Genetic Algorithms [R]. 2002
