Journal: Future Generation Computer Systems

Handling missing values for mining gradual patterns from NoSQL graph databases

Abstract

Graph databases (graph-oriented NoSQL databases) make it possible to manage highly connected data and to run complex queries, with native graph storage and processing. A property graph in a NoSQL graph engine is a labeled directed graph composed of nodes connected through edges, with a set of attributes or properties in the form of (key : value) pairs. It makes it easy to represent data and knowledge that take the form of graphs. Graph database systems have found practical applications in social networks, recommendation systems, fraud detection, and data journalism, as in the case of the Panama Papers. Such systems often suffer from missing data. In particular, in these semi-structured NoSQL databases some attributes (properties) are filled in while others are not available, either because they exist but are missing (for instance, the age of a person is unknown) or because they do not apply to a particular case (for instance, the year of military service for a girl in countries where it is mandatory only for boys). Therefore, some keys are provided for some nodes and not for others. In such a scenario, extracting knowledge from these new-generation database systems requires dealing with missing data. Some approaches have been proposed to replace missing values so that data mining techniques can be applied. However, we argue that such approaches are not appropriate because they may introduce biases or errors. In our work, we focus on extracting gradual patterns from property graphs, which provides end users with tools for mining correlations in the data when values are missing.
Our approach first defines gradual patterns in the context of NoSQL property graphs and then extends existing algorithms to handle missing values, because the anti-monotonicity of the support can no longer be relied upon in a simple manner. We thus introduce a novel approach for mining gradual patterns in the presence of missing values and test it on real and synthetic data.
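To make the anti-monotonicity remark concrete, here is a minimal Python sketch. It is not the authors' algorithm. It assumes one common notion of gradual support (the fraction of concordant node pairs) and one simple way to handle missing values (ignoring nodes that lack any attribute of the pattern). The function name `gradual_support`, the attribute names, and the toy data are all hypothetical.

```python
from itertools import combinations


def gradual_support(nodes, pattern):
    """Fraction of concordant node pairs for a gradual pattern.

    nodes   : list of dicts mapping property keys to numeric values
              (a key may be absent or None, i.e. missing).
    pattern : list of (key, direction) with direction '+' or '-'.
    Assumption: nodes missing any key of the pattern are skipped.
    """
    keys = [key for key, _ in pattern]
    # Keep only nodes where every attribute of the pattern is present.
    usable = [n for n in nodes if all(n.get(k) is not None for k in keys)]
    pairs = list(combinations(usable, 2))
    if not pairs:
        return 0.0

    def respects(a, b):
        # True if going from a to b follows every direction of the pattern.
        return all(
            (b[k] > a[k]) if d == "+" else (b[k] < a[k])
            for k, d in pattern
        )

    concordant = sum(1 for a, b in pairs if respects(a, b) or respects(b, a))
    return concordant / len(pairs)


# Toy property-graph nodes; n4 has no "seniority" key (missing value).
nodes = [
    {"id": "n1", "age": 20, "salary": 1000, "seniority": 1},
    {"id": "n2", "age": 30, "salary": 2000, "seniority": 2},
    {"id": "n3", "age": 40, "salary": 3000, "seniority": 3},
    {"id": "n4", "age": 50, "salary": 1500},
]

p_short = [("age", "+"), ("salary", "+")]
p_long = [("age", "+"), ("salary", "+"), ("seniority", "+")]

s_short = gradual_support(nodes, p_short)  # 4 of 6 pairs are concordant
s_long = gradual_support(nodes, p_long)    # 3 of 3 pairs are concordant
print(s_short, s_long)
```

Under these assumptions the longer pattern has a higher support than the pattern it extends, because the node that contradicts "age increases with salary" is excluded once `seniority` is required. A level-wise miner that prunes every super-pattern of an infrequent pattern would therefore discard valid results, which is why the support can no longer be treated as anti-monotone in a simple manner.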