Efficient Substructure Discovery from Large Semi-Structured Data

Tatsuya ASAI; Kenji ABE; Shinji KAWASOE; Hiroshi SAKAMOTO; Hiroki ARIMURA; Setsuo ARIKAWA

Abstract

In this paper, we consider a data mining problem for semi-structured data. Modeling semi-structured data as labeled ordered trees, we present an efficient algorithm for discovering frequent substructures in a large collection of semi-structured data. By extending the enumeration technique developed by Bayardo (SIGMOD'98) for discovering long itemsets, our algorithm scales almost linearly in the total size of the maximal tree patterns contained in an input collection, depending only mildly on the size of the longest pattern. We also developed several pruning techniques that significantly speed up the search. Experiments on Web data show that, combined with the proposed pruning techniques, our algorithm runs efficiently on real-life datasets over a wide range of parameters.
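The problem setting in the abstract can be illustrated with a small Python sketch. It is not the authors' algorithm: it assumes patterns are grown one leaf at a time along the rightmost branch, it recomputes support with a naive matcher instead of any incremental bookkeeping, and it counts support as the number of data trees containing the pattern. The names `build`, `matches_at`, `occurs`, and `mine`, the depth-label encoding, and the example data are all hypothetical choices made for this illustration.

```python
# Minimal sketch of frequent ordered-subtree mining (illustrative only).
# A data tree is a nested tuple: (label, [children...]).
# A pattern is a preorder list of (depth, label) pairs, root at depth 0.

def build(seq):
    """Convert a preorder (depth, label) sequence into a nested tree."""
    root = (seq[0][1], [])
    path = [root]                      # path[d] = latest node at depth d
    for depth, label in seq[1:]:
        node = (label, [])
        del path[depth:]               # climb back up to the parent level
        path[depth - 1][1].append(node)
        path.append(node)
    return root

def matches_at(pattern, data):
    """True if pattern embeds with its root mapped onto data's root,
    preserving labels, parent-child links, and left-to-right sibling order."""
    if pattern[0] != data[0]:
        return False
    remaining = iter(data[1])          # shared iterator enforces sibling order
    return all(any(matches_at(pc, dc) for dc in remaining)
               for pc in pattern[1])

def occurs(pattern, data):
    """True if pattern matches at data's root or at any descendant."""
    return matches_at(pattern, data) or any(occurs(pattern, c) for c in data[1])

def all_labels(tree, acc):
    acc.add(tree[0])
    for child in tree[1]:
        all_labels(child, acc)
    return acc

def mine(trees, minsup):
    """Return {pattern sequence: support} for patterns found in >= minsup trees.
    Assumption: support = number of data trees containing the pattern."""
    labels = set()
    for t in trees:
        all_labels(t, labels)
    labels = sorted(labels)
    frequent = {}
    stack = [[(0, l)] for l in labels]          # single-node patterns
    while stack:
        seq = stack.pop()
        pattern = build(seq)
        support = sum(occurs(pattern, t) for t in trees)
        if support < minsup:
            continue                            # no extension can be frequent
        frequent[tuple(seq)] = support
        last_depth = seq[-1][0]
        # Attach a new rightmost leaf at every depth on the rightmost branch.
        for depth in range(1, last_depth + 2):
            for l in labels:
                stack.append(seq + [(depth, l)])
    return frequent

example_trees = [
    ("a", [("b", []), ("c", [])]),
    ("a", [("b", []), ("d", []), ("c", [])]),
    ("a", [("c", []), ("b", [])]),
]
frequent = mine(example_trees, minsup=2)
```

With this example data, the pattern with root `a` and children `b`, `c` (in that order) is contained in the first two trees but not the third, where the siblings appear in the opposite order. Because each pattern is produced by exactly one sequence of rightmost extensions, no pattern is generated twice, and an infrequent pattern can be discarded together with all of its extensions.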


Bibliographic details

Source: IEICE Transactions on Information and Systems, 2004, No. 12, pp. 2754-2763 (10 pages)
Authors: Tatsuya ASAI; Kenji ABE; Shinji KAWASOE; Hiroshi SAKAMOTO; Hiroki ARIMURA; Setsuo ARIKAWA
Affiliation: Department of Informatics, Kyushu University, Fukuoka-shi, 812-8581 Japan
Indexed in: Science Citation Index (SCI); Engineering Index (EI)
Original format: PDF
Language: English
Chinese Library Classification: Radio electronics, telecommunications technology
Keywords: web mining; semi-structured data; association rule mining; itemset enumeration tree; labeled ordered trees; data mining algorithms
Date added to database: 2022-08-18 00:30:25

Similar literature

1. Efficient substructure discovery from large semi-structured data [J]. Tatsuya Asai, Kenji Abe, Shinji Kawasoe. 電子情報通信学会技術研究報告. デ-タ工学. Data Engineering. 2001, No. 342
2. DASS: efficient discovery and p-value calculation of substructures in unordered data [J]. Hollunder J, Friedel M, Beyer A. Bioinformatics. 2007, No. 26
3. Efficient Substructure Discovery from Large Semi-structured Data [C]. Tatsuya Asai, Kenji Abe, Shinji Kawasoe. SIAM International Conference on Data Mining. 2002
4. Towards efficient data analysis and management of semi-structured data. [D]. Tatikonda, Shirish. 2010
5. Efficient merging of data from multiple samples for determination of anomalous substructure [O]. David L. Akey, Thomas C. Terwilliger, Janet L. Smith
6. Efficient Substructure Discovery from Large Semi-structured Data [O]. Kenji Abe, Shinji Kawasoe, Hiroki Arimura. 2001
