
Iterative feature weighting for identification of relevant features in machine learning: With multilayer perceptron, radial basis function and support vector architectures.



Abstract

In multivariate data analysis, samples may be described in terms of many features, but in specific tasks some features may be redundant or irrelevant, serving primarily as sources of noise and confusion. Irrelevant and redundant features not only increase the cost of data collection, but may also be a reason why machine learning is often hampered by a lack of an adequate number of samples.

Feature selection can address this issue by identifying and selecting only those features that are relevant to the task at hand. An alternative approach is feature weighting, which assigns a continuous-valued weight to every feature used in the description of the data samples. Feature weighting can reduce the effect of irrelevant features by assigning them smaller weights while assigning larger weights to relevant features.

In this dissertation, we study the effect of irrelevant features on neural network design and propose a framework for iterative feature weighting with neural networks. The framework iteratively improves the trained neural networks until the optimal network model is reached. At the same time, the feature weights are evaluated through the trained networks and hence converge to their optimal values as well. We present a convergence theorem to guide the design of the framework and then implement the framework for three typical neural network architectures. These iterative feature weighting methods are applied to locally synthesized data and to benchmark datasets, with good results. Results on the MONK's problems show that the methods are very effective at identifying relevant features that have complex logical relationships in the data. Results on the Boston housing data show that the performance of regression models can be improved through iterative feature weighting. Results on the Leukemia gene expression data show that the methods can be used not only to improve the accuracy of pattern classification, but also to identify features that may have a subtle nonlinear correlation to the task in question.
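The iterative loop described above (train a model, derive per-feature weights from the trained model, rescale the inputs, and retrain until the weights converge) can be sketched in a few lines of NumPy. This is a minimal illustration only: it uses closed-form ridge regression as a stand-in for the dissertation's trained neural networks, and the sensitivity-based weighting rule shown here is an assumption for illustration, not the exact update rule proposed in the dissertation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic regression data: feature 0 is relevant, features 1-2 are pure noise.
n, d = 200, 3
X = rng.normal(size=(n, d))
y = 2.0 * X[:, 0] + 0.1 * rng.normal(size=n)

def train_ridge(Xw, y, lam=1e-2):
    """Closed-form ridge fit, standing in for a trained network (assumption)."""
    k = Xw.shape[1]
    return np.linalg.solve(Xw.T @ Xw + lam * np.eye(k), Xw.T @ y)

# Iterative feature weighting: train on weighted inputs, measure the model's
# sensitivity to each raw feature, normalize into new weights, and repeat.
w = np.ones(d)                      # initial feature weights
for _ in range(10):
    beta = train_ridge(X * w, y)    # model trained on weighted features
    sens = np.abs(beta) * w         # output sensitivity w.r.t. raw features
    w = sens / sens.max()           # normalized feature weights in [0, 1]

print(np.round(w, 3))  # weight near 1 for the relevant feature, near 0 for noise
```

Because the ridge penalty acts on the coefficients of the *weighted* features, down-weighting a feature effectively strengthens its regularization, so the weights of irrelevant features shrink toward zero over the iterations while relevant features retain weights near one.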
