Journal: IEEE Transactions on Knowledge and Data Engineering

Finding Probabilistic Prevalent Colocations in Spatially Uncertain Data Sets



Abstract

A spatial colocation pattern is a group of spatial features whose instances are frequently located together in geographic space. Discovering colocations has many useful applications. For example, colocated plant species discovered from plant distribution data sets can contribute to the analysis of plant geography, phytosociology studies, and plant protection recommendations. In this paper, we study the colocation mining problem in the context of uncertain data, as the data generated from a wide range of data sources are inherently uncertain. One straightforward method to mine the prevalent colocations in a spatially uncertain data set is to compute the expected participation index of a candidate and decide whether it exceeds a minimum prevalence threshold. Although this definition has been widely adopted, it misses important information about the confidence that can be associated with the participation index of a colocation. We propose another definition, probabilistic prevalent colocations (PPCs), which aims to find all the colocations that are likely to be prevalent in a randomly generated possible world. Finding PPCs turns out to be difficult. First, we propose pruning strategies for candidates to reduce the amount of computation of the probabilistic participation index values. Next, we design an improved dynamic programming algorithm for identifying candidates. This algorithm is suitable for both parallel computation and approximate computation. Finally, the effectiveness and efficiency of the proposed methods, as well as of the pruning strategies and the optimization techniques, are verified by extensive experiments with "real + synthetic" spatially uncertain data sets.
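The contrast between the expected participation index and probabilistic prevalence can be illustrated with a minimal Python sketch. It is not the paper's algorithm: it enumerates every possible world by brute force, which is only feasible for a handful of instances. It assumes that each instance exists independently with a given probability, that the participation index of a colocation is the minimum, over its features, of the fraction of present instances that take part in at least one neighboring pair, and that a world missing one of the features has index 0. The data, the names `instances`, `neighbors`, `participation_index`, and `summarize`, and the parameter `min_prev` are all hypothetical.

```python
from itertools import product

# Hypothetical toy data: instance id -> (feature, existence probability).
instances = {
    "a1": ("A", 0.9),
    "a2": ("A", 0.5),
    "b1": ("B", 0.8),
    "b2": ("B", 0.4),
}
# Hypothetical neighbor relation between A-instances and B-instances.
neighbors = {("a1", "b1"), ("a2", "b1"), ("a2", "b2")}


def participation_index(world):
    """Participation index of {A, B} in one possible world (a set of ids)."""
    rows = [(a, b) for a, b in neighbors if a in world and b in world]
    ratios = []
    for pos, feature in enumerate(("A", "B")):
        present = [i for i in world if instances[i][0] == feature]
        if not present:
            return 0.0  # assumption: a missing feature gives index 0
        participating = {row[pos] for row in rows}
        ratios.append(len(participating) / len(present))
    return min(ratios)


def summarize(min_prev):
    """Return (expected PI, P(PI >= min_prev)) by enumerating all worlds."""
    ids = sorted(instances)
    expected_pi = 0.0
    prob_prevalent = 0.0
    for mask in product((False, True), repeat=len(ids)):
        world = {i for i, keep in zip(ids, mask) if keep}
        # Probability of this world under independent existence (assumption).
        p = 1.0
        for i, keep in zip(ids, mask):
            q = instances[i][1]
            p *= q if keep else 1.0 - q
        pi = participation_index(world)
        expected_pi += p * pi
        if pi >= min_prev:
            prob_prevalent += p
    return expected_pi, prob_prevalent
```

Calling `summarize(min_prev=0.5)` returns both quantities for the toy data. Under these assumptions, the expected-value definition compares the first number with `min_prev`, whereas a probabilistic definition would compare the second number with a separate minimum probability threshold. The enumeration grows as 2^n in the number of instances, which is the kind of cost that pruning and dynamic programming are meant to avoid.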
