Feature Selection and Classification on Matrix Data: From Large Margins To Small Covering Numbers



Abstract

We investigate the problem of learning a classification task for datasets that are described by matrices. Rows and columns of these matrices correspond to objects, where row and column objects may belong to different sets, and the entries in the matrix express the relationships between them. We interpret the matrix elements as being produced by an unknown kernel that operates on object pairs, and we show that, under mild assumptions, these kernels correspond to dot products in some (unknown) feature space. By minimizing a bound on the generalization error of a linear classifier, obtained using covering numbers, we derive an objective function for model selection according to the principle of structural risk minimization. The new objective function has the advantage that it allows the analysis of matrices that are not positive definite, and not even symmetric or square. We then consider the case in which row objects are interpreted as features. We suggest an additional constraint that imposes sparseness on the row objects and show that the method can then be used for feature selection. Finally, we apply this method to data obtained from DNA microarrays, where "column" objects correspond to samples, "row" objects correspond to genes, and matrix elements correspond to expression levels. Benchmarks are conducted using standard one-gene classification, and using support vector machines and K-nearest neighbors after standard feature selection. Our new method extracts a sparse set of genes and provides superior classification results.
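The setting in the abstract (a rectangular matrix whose rows act as candidate features and whose columns are the samples to classify, with a sparseness constraint on the row weights) can be illustrated with a small sketch. This is not the objective function derived in the paper: it swaps in an L1-penalized least-squares fit solved by coordinate descent, chosen only because it also drives most row weights to exactly zero. The toy matrix `K`, the labels `y`, the penalty `lam`, and all function names are assumptions made for illustration.

```python
# Illustrative stand-in, NOT the paper's method: an L1-penalized
# least-squares fit (coordinate descent) on a rectangular matrix.
# Rows of K play the role of "genes", columns the role of "samples".

def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return 0.0 inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def fit_sparse_row_weights(K, y, lam=0.1, n_sweeps=200):
    """Minimize (1/2n) * sum_j (y_j - sum_i alpha_i K[i][j])^2 + lam * sum_i |alpha_i|."""
    n_rows, n_cols = len(K), len(K[0])
    alpha = [0.0] * n_rows
    residual = list(y)  # y minus the current prediction (alpha starts at zero)
    for _ in range(n_sweeps):
        for i in range(n_rows):
            row = K[i]
            norm_sq = sum(v * v for v in row) / n_cols
            if norm_sq == 0.0:
                continue
            # Correlation of row i with the residual that ignores row i itself.
            rho = sum(row[j] * (residual[j] + alpha[i] * row[j])
                      for j in range(n_cols)) / n_cols
            new_alpha = soft_threshold(rho, lam) / norm_sq
            delta = new_alpha - alpha[i]
            if delta != 0.0:
                for j in range(n_cols):
                    residual[j] -= delta * row[j]
                alpha[i] = new_alpha
    return alpha

def predict(K, alpha, j):
    """Classify column object j by the sign of its weighted row entries."""
    score = sum(alpha[i] * K[i][j] for i in range(len(K)))
    return 1 if score >= 0 else -1

# Hypothetical 4 x 6 matrix: neither square nor symmetric.
K = [
    [2.0, 1.8, 2.2, -2.1, -1.9, -2.0],    # row that tracks the class label
    [0.1, -0.2, 0.1, 0.2, -0.1, -0.1],    # low-amplitude noise rows
    [-0.3, 0.2, 0.1, -0.1, 0.3, -0.2],
    [0.05, 0.0, -0.05, 0.0, 0.05, -0.05],
]
y = [1, 1, 1, -1, -1, -1]  # class labels of the six column objects

alpha = fit_sparse_row_weights(K, y)
selected = [i for i, a in enumerate(alpha) if a != 0.0]
predictions = [predict(K, alpha, j) for j in range(len(y))]
print("selected rows:", selected)
print("predictions:", predictions)
```

Rows whose weight stays at exactly zero are dropped from the classifier, which is the sense in which a sparseness constraint doubles as feature selection. The paper's own objective is derived from a covering-number bound and differs from this penalty.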


Bibliographic details

Source: Annual Neural Information Processing Systems Conference, 2003, 8 pages
Authors: Sepp Hochreiter; Klaus Obermayer
Original format: PDF
Chinese Library Classification: Information processing

Similar literature

1. Matrix-Based Margin-Maximization Band Selection With Data-Driven Diversity for Hyperspectral Image Classification [J]. Xiaohui Wei, Wen Zhu, Bo Liao. IEEE Transactions on Geoscience and Remote Sensing. 2018, No. 12
2. A new feature selection method on classification of medical datasets: Kernel F-score feature selection [J]. Kemal Polat, Salih Guenes. Expert Systems with Applications. 2009, No. 7
3. Feature selection for multi-class classification using pairwise class discriminatory measure and covering concept [J]. Hyeon Ji, Bang S.Y. Electronics Letters. 2000, No. 6
4. Feature Selection and Classification on Matrix Data: From Large Margins To Small Covering Numbers [C]. Sepp Hochreiter, Klaus Obermayer. Annual Neural Information Processing Systems Conference. 2003
5. Comparative Analysis of Feature Selection and Classification Methods for Epigenetic Methylation Data [D]. Kleyn, Aaron. 2021
6. Kernel-based Joint Feature Selection and Max-Margin Classification for Early Diagnosis of Parkinson’s Disease [O]. Ehsan Adeli, Guorong Wu, Behrouz Saghafi
7. Non-negative Matrix Factorization as a Feature Selection Tool for Maximum Margin Classifiers [O]. Mithun Das Gupta, Jing Xiao. 2013
