Source: IEEE International Conference on Machine Learning and Applications

Learning Effective Query Management Strategies from Big Data

Abstract

The availability of big data collections, together with powerful hardware and software for processing them, makes it possible to learn useful insights from data, which can be exploited for purposes such as marketing and fault prevention. It is also possible to learn important metadata that suggest how data should be manipulated in several advanced operations. In this paper, we show the potential of learning from data by focusing on the problem of query relaxation: returning an approximate answer to a database query when no exact result is available and the system would otherwise return an empty answer set or, even worse, erroneous mismatched results. In particular, we introduce a novel approach for rewriting queries that are in disjunctive normal form and contain a mixture of discrete and continuous attributes. The approach preprocesses data collections to discover the implicit relationships among the domain attributes, and then uses this knowledge to rewrite the constraints of the failing query. As a first step, the approach learns a set of functional dependencies from the data and ranks them according to dedicated mechanisms; the ranking later makes it possible to predict the order in which the extracted dependencies should be applied to rewrite the failing query. An experimental evaluation on three real data sets shows the effectiveness of the approach in terms of robustness and coverage.
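The abstract does not spell out the ranking mechanisms or the rewriting rules, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' algorithm. It assumes a single conjunctive query over discrete attributes (no disjunctions, no continuous attributes), scores each candidate dependency lhs -> rhs by a simple majority-agreement confidence, and relaxes a failing query by dropping the dependent (rhs) constraint in rank order. The table contents and the names fd_confidence, rank_dependencies, run_query, and relax are hypothetical and chosen for illustration.

```python
from collections import Counter, defaultdict
from itertools import permutations

# Toy data; attribute names and values are invented for illustration.
ROWS = [
    {"city": "Rome", "country": "IT", "type": "hotel"},
    {"city": "Rome", "country": "IT", "type": "hostel"},
    {"city": "Milan", "country": "IT", "type": "hotel"},
    {"city": "Paris", "country": "FR", "type": "hotel"},
    {"city": "Paris", "country": "FR", "type": "hostel"},
]


def fd_confidence(rows, lhs, rhs):
    """Fraction of rows that agree with the majority rhs value of their lhs group.

    A value of 1.0 means lhs -> rhs holds exactly on this data.
    This scoring rule is an assumption, not the paper's ranking mechanism.
    """
    groups = defaultdict(Counter)
    for row in rows:
        groups[row[lhs]][row[rhs]] += 1
    kept = sum(max(counts.values()) for counts in groups.values())
    return kept / len(rows)


def rank_dependencies(rows):
    """Score every single-attribute dependency lhs -> rhs, strongest first."""
    attrs = sorted(rows[0])
    scored = [(fd_confidence(rows, a, b), a, b) for a, b in permutations(attrs, 2)]
    # Ties are broken alphabetically so the order is deterministic.
    return sorted(scored, key=lambda t: (-t[0], t[1], t[2]))


def run_query(rows, query):
    """Evaluate a conjunction of equality constraints given as a dict."""
    return [r for r in rows if all(r[k] == v for k, v in query.items())]


def relax(rows, query):
    """Drop dependent constraints, in rank order, until the query has answers."""
    query = dict(query)  # work on a copy so the caller's query is untouched
    for _, lhs, rhs in rank_dependencies(rows):
        if run_query(rows, query):
            break
        # If both sides are constrained, the rhs constraint is either
        # redundant or contradictory given lhs, so it is dropped first.
        if lhs in query and rhs in query:
            del query[rhs]
    return query, run_query(rows, query)


if __name__ == "__main__":
    failing = {"city": "Rome", "country": "FR", "type": "hotel"}
    relaxed, answers = relax(ROWS, failing)
    print(relaxed)
    print(answers)
```

On this toy table the dependency city -> country holds exactly and is ranked first, so the contradictory country constraint is the first to be dropped and the relaxed query keeps only the city and type constraints. A fuller treatment would also need to handle disjunctive normal form and range constraints on continuous attributes, which this sketch leaves out.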