Automatic POI Matching Using an Outlier Detection Based Approach


Abstract

Points of Interest (POIs) are now widely used in many applications, mainly because of the growing amount of related data available online, notably from volunteered geographic information (VGI) sources. Connecting these data across sources is useful for validating, correcting, and deduplicating records in a database. However, there is no standard way to identify the same POI across different sources, and doing so manually can be very expensive. Automatic POI matching has therefore been an attractive research topic. In our work, we propose a novel data-driven machine learning approach based on an outlier detection algorithm to match POIs automatically. Surprisingly, the works presented so far do not use data-driven machine learning approaches. One possible reason is that such approaches need a training dataset, which must be constructed by manually matching some POIs. To mitigate this, we took advantage of the Cross-walk API, available at the time we started our project, which allowed us to retrieve already-matched POI data from different sources in US territory. We trained and tested our model on a dataset containing Factual, Facebook and Foursquare POIs from New York City and successfully applied it to another dataset of Facebook and Foursquare POIs from Porto, Portugal, finding matches with an accuracy of around 95%. These encouraging results confirm our approach as an effective way to address the problem of automatically matching POIs. They also show that such a model can be trained with data available from multiple sources and applied to datasets from locations other than those used in training. Furthermore, as a data-driven machine learning approach, the model can be continuously improved by adding new validated data to its training dataset.
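
The abstract does not say which features or which outlier detection algorithm the authors use, so the following is only a minimal sketch of the general idea, under stated assumptions: learn what "matched" POI pairs look like from already-matched examples, then treat a candidate pair as a match if it is not an outlier relative to them. The two features (name dissimilarity and distance in metres), the one-sided z-score rule, the `ZScoreMatcher` class, and all POI records and coordinates below are hypothetical illustrations, not the paper's method or data.

```python
import math
from difflib import SequenceMatcher
from statistics import mean, pstdev


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    r = 6371000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = p2 - p1
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def pair_features(a, b):
    """Assumed features for a POI pair: larger values mean 'less alike'."""
    name_sim = SequenceMatcher(None, a["name"].lower(), b["name"].lower()).ratio()
    dist = haversine_m(a["lat"], a["lon"], b["lat"], b["lon"])
    return (1.0 - name_sim, dist)


class ZScoreMatcher:
    """Toy outlier detector trained only on pairs known to match."""

    def __init__(self, threshold=3.0):
        self.threshold = threshold

    def fit(self, matched_pairs):
        feats = [pair_features(a, b) for a, b in matched_pairs]
        cols = list(zip(*feats))
        self.means = [mean(c) for c in cols]
        # Floor the spread so a constant feature cannot cause division by zero.
        self.stds = [max(pstdev(c), 1e-6) for c in cols]
        return self

    def is_match(self, a, b):
        # One-sided test: only pairs that are *less* alike than the
        # training matches are flagged as outliers (non-matches).
        f = pair_features(a, b)
        return all((x - m) / s <= self.threshold
                   for x, m, s in zip(f, self.means, self.stds))


def poi(name, lat, lon):
    return {"name": name, "lat": lat, "lon": lon}


# Made-up training pairs standing in for already-matched POIs.
training_pairs = [
    (poi("Joe's Pizza", 40.7306, -73.9866), poi("Joes Pizza", 40.7307, -73.9865)),
    (poi("Central Park Zoo", 40.7678, -73.9718), poi("Central Park Zoo", 40.7679, -73.9720)),
    (poi("Strand Bookstore", 40.7333, -73.9909), poi("Strand Book Store", 40.7332, -73.9908)),
    (poi("Katz's Delicatessen", 40.7223, -73.9874), poi("Katz's Deli", 40.7222, -73.9873)),
]
matcher = ZScoreMatcher(threshold=3.0).fit(training_pairs)

# Made-up candidate pairs from a different city than the training data.
lello_a = poi("Livraria Lello", 41.1469, -8.6148)
lello_b = poi("Livraria Lello & Irmão", 41.1470, -8.6149)
clerigos = poi("Torre dos Clérigos", 41.1458, -8.6139)

print(matcher.is_match(lello_a, lello_b))
print(matcher.is_match(lello_a, clerigos))
```

Because the detector is fit only on positive (matched) examples, extending the training set with newly validated matches simply means calling `fit` again on the larger list, which mirrors the continuous-improvement point made in the abstract. A real system would likely need richer features and a more robust detector than a per-feature z-score.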


Bibliographic details

Source: International Symposium on Intelligent Data Analysis | 2018 | 394p | 12 pages in total
Authors: Alexandre Almeida; Ana Alves; Rui Gomes
Original format: PDF
Chinese Library Classification: TP18-53
Keywords: Machine learning; Outlier detection; Point-Of-Interest; GIS


