Compressing Closed Frequent Itemsets with Controlled Information Loss

Abstract

Closed frequent itemsets (CFIs) condense frequent itemsets without loss of information. For large, dense datasets such as big data and unbounded big data streams, even the number of CFIs generated can be enormous. In such scenarios an approximate solution is preferred over an exact one. The Subset Significance Threshold (SST) is an effective constraint variable for mining significant CFIs: the support of each insignificant CFI is approximated by the support of its immediate superset. However, because of a chaining effect, a few insignificant CFIs are approximated beyond the specified SST. To overcome this limitation, the authors propose an enhancement to SST (e-SST) that improves the accuracy of the approximated insignificant CFIs. The merging of insignificant CFIs into their supersets is limited to one level, so that the approximation stays bounded within the specified SST. Experimental results show that e-SST is more effective than SST at keeping the approximated support of insignificant CFIs within the specified threshold, thus reducing information loss.
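
The chaining effect described in the abstract can be illustrated with a small sketch. This is not the authors' algorithm. The abstract does not define the significance test, so the sketch assumes a CFI is insignificant when its relative support drop to its immediate superset is at most the SST. The names `immediate_superset`, `is_insignificant`, and `approximate` and the toy supports are hypothetical.

```python
from typing import Dict, FrozenSet, Optional

Itemset = FrozenSet[str]


def immediate_superset(x: Itemset, supports: Dict[Itemset, int]) -> Optional[Itemset]:
    """Smallest proper superset of x among the CFIs (ties: highest support)."""
    candidates = [y for y in supports if x < y]
    if not candidates:
        return None
    return min(candidates, key=lambda y: (len(y), -supports[y]))


def is_insignificant(x: Itemset, y: Itemset, supports: Dict[Itemset, int], sst: float) -> bool:
    """Assumed test: relative support drop from x to its superset y is within sst."""
    return (supports[x] - supports[y]) / supports[x] <= sst


def approximate(supports: Dict[Itemset, int], sst: float, one_level: bool) -> Dict[Itemset, int]:
    """Approximate each CFI's support by that of the superset it is merged into.

    one_level=False follows merges repeatedly (the chaining behaviour);
    one_level=True stops after a single merge (the one-level limit).
    """
    result = {}
    for x in supports:
        current = x
        while True:
            parent = immediate_superset(current, supports)
            if parent is None or not is_insignificant(current, parent, supports, sst):
                break
            current = parent
            if one_level:
                break
        result[x] = supports[current]
    return result


# Toy data (made up for illustration): a chain A < AB < ABC.
A, AB, ABC = frozenset({"A"}), frozenset({"A", "B"}), frozenset({"A", "B", "C"})
supports = {A: 100, AB: 96, ABC: 92}
sst = 0.05

chained = approximate(supports, sst, one_level=False)
limited = approximate(supports, sst, one_level=True)

# Relative error of the approximated support of {A} under each scheme.
chained_error = (supports[A] - chained[A]) / supports[A]
limited_error = (supports[A] - limited[A]) / supports[A]
print(chained[A], chained_error)
print(limited[A], limited_error)
```

In this toy chain each single merge stays within the 5% threshold, but following two merges in a row takes the support of {A} from 100 to 92, an 8% error. Stopping after one merge keeps the error at 4%.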

Bibliographic record

Source
IEEE International Conference on Cloud Computing in Emerging Markets | 2019 | pp. 69-74 | 6 pages
Authors
Pavitra Bai S; Ravikumar G K; Narendra B K
Original format: PDF
Keywords
Big Data; data mining

Similar literature

1. EFFICIENT SUBSET-LATTICE ALGORITHMS FOR MINING CLOSED FREQUENT ITEMSETS AND MAXIMAL FREQUENT ITEMSETS IN DATA STREAMS [J]. Ye-In Chang, Chia-En Li, Wei-Hau Peng. International Journal of Electrical Engineering: Transactions of the Chinese Institute of Engineers, Series E. 2013, No. 2
2. Geo Map Visualization for Frequent Purchaser in Online Shopping Database Using an Algorithm LP-Growth for Mining Closed Frequent Itemsets [J]. M. Sinthuja, N. Puviarasan, P. Aruna. Procedia Computer Science. 2018, No. 1
3. Mining frequent, maximal and closed frequent itemsets over data stream - a review [J]. M. Jeya Sutha, F. Ramesh Dhanaseelan. International journal of data analysis techniques and strategies. 2017, No. 1
4. Compressing Closed Frequent Itemsets with Controlled Information Loss [C]. Pavitra Bai S, Ravikumar G K, Narendra B K. IEEE International Conference on Cloud Computing in Emerging Markets. 2019
5. Communications Between Air Traffic Controllers and Pilots During Simulated Arrivals: Relation of Closed Loop Communication Deviations to Loss of Separation [D]. Lieber, Christopher. 2020
6. Bit-Table Based Biclustering and Frequent Closed Itemset Mining in High-Dimensional Binary Data [O]. András Király, Attila Gyenesei, János Abonyi
7. SURVEY CONDUCTED ON ALGORITHMS FOR FINDING FREQUENT AND CLOSED FREQUENT ITEMSET WITH INCREMENTAL APPROACH [O]. 2015
