
Method of Classification through Normal Distribution Approximation Using Estimating the Adjacent and Multidimensional Scaling



Abstract

Two types of classification methods are applied when performing classification using machine learning: (1) those that presume a data distribution via kernel functions, such as the support vector machine (SVM), and (2) those that presume no data distribution, such as k-NN. With such methods, it is easy to obtain data whose values are close for the class as a whole, and data with close values are assumed to belong to the same class with high probability. In other words, although it is easy to obtain approximate values of data for each class, values close to one another are relatively likely to belong to the same class. Consequently, in the feature space where the data parameters are obtained, there is a high probability of data of the same class appearing around the position of a given sample's parameters. This study proposes machine learning algorithms that approximate this tendency using the density function of the normal distribution. In addition, if a small amount of data of a different class exists where data of the same class are concentrated, the impact of that data is significant. This study proposes two algorithms. One calculates the influence that the training data in a neighborhood exert over the entire feature space; the other calculates the effect of each training datum on the entire feature space. Both methods use two parameters: the influence of each training datum and the range treated as the neighborhood. Quasi-optimal values of these two parameters are determined by the steepest descent method. Furthermore, to reduce the density of the influence of the training data, we propose an improved method that, as preprocessing, relocates the training data by multidimensional scaling based on the distances between training data.
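The central idea, scoring each class by summing normal-density contributions of the training points around a query point, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the parameter names `weight` and `sigma` are hypothetical stand-ins for the paper's two tuned parameters (the influence of each training datum and the neighborhood range), and the steepest-descent tuning and MDS preprocessing steps are omitted.

```python
import numpy as np

def class_scores(x, train_X, train_y, sigma=1.0, weight=1.0):
    """Score each class by summed normal-density contributions of its
    training points at query point x. sigma plays the role of the
    neighborhood-range parameter, weight the per-datum influence."""
    d = train_X.shape[1]
    norm = (2.0 * np.pi * sigma ** 2) ** (-d / 2.0)  # normal density constant
    scores = {}
    for cls in np.unique(train_y):
        pts = train_X[train_y == cls]
        sq_dist = np.sum((pts - x) ** 2, axis=1)
        scores[cls] = weight * norm * np.exp(-sq_dist / (2.0 * sigma ** 2)).sum()
    return scores

def classify(x, train_X, train_y, sigma=1.0, weight=1.0):
    """Assign x to the class with the highest density score."""
    scores = class_scores(x, train_X, train_y, sigma, weight)
    return max(scores, key=scores.get)
```

A query point surrounded mostly by one class receives the largest summed density for that class, which is the "same class of data appearing around the position" behavior the abstract describes; in the paper the two parameters are then tuned toward a quasi-optimal solution rather than fixed as here.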
