
COMBI-Operator ― Database Support for Data Mining Applications


Abstract

Database support for data mining has become an important research topic. For large, high-dimensional data volumes in particular, comprehensive support from the database side is necessary. In this paper, we first identify the data-intensive subproblem of aggregating high-dimensional data in all possible low-dimensional projections (for instance, estimating low-dimensional histograms), which occurs in several established data mining techniques. Second, we show that existing OLAP SQL extensions are insufficient for high-dimensional data and propose a new SQL operator that fits seamlessly into the set of existing OLAP Group By operators. Third, we propose efficient implementations of the operator that take the limited resources of main memory into account. We demonstrate on a number of real and synthetic data sets that, for the identified subproblem, our new implementations yield a large speedup (up to a factor of 10) over existing methods built into commercially available database systems.
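
To make the identified subproblem concrete, the following is a minimal Python sketch of aggregating a data set in all k-dimensional projections with a single scan over the rows. It is a naive in-memory illustration only, not the proposed SQL operator or any of its implementations; the function name `projection_counts`, the parameter `k`, and the toy data are assumptions introduced here.

```python
from collections import Counter
from itertools import combinations


def projection_counts(rows, k):
    """Count rows per value combination in every k-attribute projection.

    Hypothetical helper for illustration: returns a dict that maps each
    tuple of attribute indices to a Counter of projected value tuples.
    """
    if not rows:
        return {}
    num_attrs = len(rows[0])
    # One histogram (Counter) per combination of k attributes.
    subsets = list(combinations(range(num_attrs), k))
    counts = {dims: Counter() for dims in subsets}
    # Single scan over the data: each row updates every projection.
    for row in rows:
        for dims in subsets:
            counts[dims][tuple(row[i] for i in dims)] += 1
    return counts


# Toy data set with three discrete attributes (made up for this sketch).
rows = [
    (0, 1, 0),
    (0, 1, 1),
    (1, 0, 1),
    (0, 0, 1),
]
counts = projection_counts(rows, 2)
for dims, hist in counts.items():
    print(dims, dict(hist))
```

With d attributes there are C(d, k) projections, so the number of histograms kept in memory grows quickly with d. This is the main-memory pressure that the abstract says the proposed implementations take into account.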
