Journal: IEEE Transactions on Knowledge and Data Engineering

Efficient attribute-oriented generalization for knowledge discovery from large databases



Abstract

We present GDBR (Generalize DataBase Relation) and FIGR (Fast, Incremental Generalization and Regeneralization), two enhancements of Attribute-Oriented Generalization, a well-known technique for knowledge discovery from databases. Both GDBR and FIGR run in O(n) time and are therefore optimal. GDBR is an online algorithm and requires only a small, constant amount of space. FIGR also requires a constant amount of space, which is generally reasonable but may grow large under certain circumstances. FIGR is incremental: changes to the database can be reflected in the generalization results without rereading the input data. FIGR also allows fast regeneralization to both higher and lower levels of generality without rereading the input. We compare GDBR and FIGR with two previous algorithms, LCHR and AOI, which are O(n log n) and O(np), respectively, where n is the number of input tuples and p is the number of tuples in the generalized relation. Both LCHR and AOI require O(n) space, which causes memory problems for large inputs. We implemented all four algorithms and ran empirical tests, finding that GDBR and FIGR are faster. In addition, their runtimes increase only linearly with input size, while the runtimes of LCHR and AOI increase greatly once the input size exceeds memory limits.
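
As a rough illustration of how a single linear pass over the input can produce a generalized relation, the sketch below maps each tuple through a concept hierarchy and accumulates counts in a hash table keyed by the generalized tuple. It is not GDBR or FIGR as published. The hierarchy contents, the sample rows, and the names `HIERARCHIES`, `generalize_value`, and `generalize_relation` are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical concept hierarchies (illustrative only, not from the paper).
# Each entry maps a raw attribute value to a more general concept,
# either through a lookup table or through a function.
HIERARCHIES = {
    "city": {
        "Regina": "Saskatchewan",
        "Saskatoon": "Saskatchewan",
        "Calgary": "Alberta",
    },
    "age": lambda a: "young" if a < 30 else "older",
}


def generalize_value(attr, value):
    """Climb one level of the (assumed) hierarchy for a single attribute."""
    rule = HIERARCHIES[attr]
    if callable(rule):
        return rule(value)
    # Values missing from the table fall back to the most general concept.
    return rule.get(value, "ANY")


def generalize_relation(rows, attrs):
    """Read the input once and count identical generalized tuples.

    Each row is processed in constant time for a fixed attribute list,
    so the pass is linear in the number of input rows. Memory use depends
    on the number of distinct generalized tuples, not on the input size.
    """
    counts = Counter()
    for row in rows:
        key = tuple(generalize_value(a, row[a]) for a in attrs)
        counts[key] += 1
    return counts


if __name__ == "__main__":
    sample_rows = [
        {"city": "Regina", "age": 25},
        {"city": "Saskatoon", "age": 22},
        {"city": "Calgary", "age": 41},
    ]
    for generalized, count in generalize_relation(sample_rows, ["city", "age"]).items():
        print(generalized, count)
```

Because each input tuple is consumed as it arrives and only the counts are kept, a loop of this shape can work online, and tuples added later can be folded into the same table without rereading earlier input.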
