Journal: Machine Learning

Lowest probability mass neighbour algorithms: relaxing the metric constraint in distance-based neighbourhood algorithms



Abstract

The use of distance metrics such as the Euclidean or Manhattan distance for nearest neighbour algorithms allows for interpretation as a geometric model, and it has been widely assumed that the metric axioms are a necessary condition for many data mining tasks. We show that this assumption can in fact be an impediment to producing effective models. We propose to use mass-based dissimilarity, which employs estimates of the probability mass to measure dissimilarity, to replace the distance metric. This substitution effectively converts nearest neighbour (NN) algorithms into lowest probability mass neighbour (LMN) algorithms. Both types of algorithms employ exactly the same algorithmic procedures, except for the substitution of the dissimilarity measure. We show that LMN algorithms overcome key shortcomings of NN algorithms in classification and clustering tasks. Unlike existing generalised data independent metrics (e.g., quasi-metric, meta-metric, semi-metric, peri-metric) and data dependent metrics, the proposed mass-based dissimilarity is unique because its self-dissimilarity is data dependent and non-constant.
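The core substitution described above can be illustrated with a minimal one-dimensional sketch. In one dimension, the probability mass of the smallest region covering two points can be estimated as the fraction of data points falling in the interval between them; the function name and the toy data below are illustrative assumptions, not the paper's implementation (which uses tree-based partitioning over multiple dimensions).

```python
import numpy as np

def mass_dissimilarity_1d(x, y, data):
    """Illustrative 1-D mass-based dissimilarity: the fraction of data
    points falling in the smallest interval covering both x and y.
    Note that self-dissimilarity m(x, x) counts the points equal to x,
    so it is data dependent and non-constant, unlike a metric where
    d(x, x) = 0 always."""
    lo, hi = min(x, y), max(x, y)
    return np.mean((data >= lo) & (data <= hi))

# Two pairs with the SAME Euclidean distance (0.4), in regions of
# different density: mass-based dissimilarity tells them apart.
data = np.array([0.0, 0.1, 0.2, 0.3, 0.4, 5.0, 5.4])
dense_pair = mass_dissimilarity_1d(0.0, 0.4, data)   # 5 of 7 points in [0.0, 0.4]
sparse_pair = mass_dissimilarity_1d(5.0, 5.4, data)  # 2 of 7 points in [5.0, 5.4]
```

Here the dense pair is judged *more* dissimilar than the sparse pair despite identical Euclidean distance, which is the behaviour that lets LMN algorithms adapt neighbourhoods to local density where NN algorithms cannot.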


