An Efficient Map-Reduce Algorithm for Computing Formal Concepts from Binary data

机译：一种高效地图 - 减少从二进制数据计算正式概念的映射算法

获取原文

页面导航

摘要
著录项
相似文献
相关主题

摘要

The problem of discovering all formal concepts embedded in a binary relational dataset is of significant interest for many data analysis and processing problems. The problem of enumerating all concepts for a dataset is known to be NP-hard. A number of Map-Reduce based algorithms have been developed to conquer the difficulty of processing large datasets. But these algorithms are not very scalable because at core all these algorithms stick to a DFS based sequential search for individual concepts, applying Map-Reduce formalism to parallelize the processing at nodes of the DFS tree. We have present here a completely different and Map-Reduce based formulation for parallelizing the concept discovery problem. One major difference of our approach is that we seek to find a sufficient set of concepts only, and this sufficient set can be used to generate all other concepts in the lattice. This formulation is not very suitable for sequential execution but adapts extremely well to the parallel environments based on Map-Reduce operators. We show in this paper that our algorithm is significantly faster than all the known Map-reduce formulations for discovering concepts in binary relational datasets. We have presented in this paper the outline of the theoretical foundations for our algorithm and empirical tests with a number of benchmarking datasets. We also show that the computationally very difficult problem of finding 3-clusters in pairs of binary relational datasets can also be made very efficient by our formulation.

机译：发现嵌入在二进制关系数据集中的所有正式概念的问题对于许多数据分析和处理问题来说是重大兴趣。已知枚举数据集的所有概念的问题是NP-HARD。已经开发了许多地图减少基于算法以征服处理大型数据集的难度。但是这些算法不是非常可扩展的，因为在核心All这些算法中，所有这些算法都基于基于DFS的顺序搜索各个概念，应用MAP-DENECTIONA中的形式主义并将其并行化DFS树节点的处理。我们在这里介绍了一个完全不同的和地图 - 减少了基于概念发现问题的并行化的基础配方。我们的方法的一个主要区别是我们寻求发现足够的概念，并且这种充足的集合可用于在格子中产生所有其他概念。该配方不是非常适合顺序执行，但基于地图减少运算符对并行环境非常良好。我们在本文中展示了我们的算法明显比所有已知的地图减少用于在二进制关系数据集中发现概念的制剂的速度更快。我们在本文中提出了我们算法的理论基础的大纲和具有许多基准测试数据集的实证测试。我们还表明，通过我们的配方可以非常有效地在二元关系数据集成对找到3集群的计算上非常困难的问题。

著录项

来源
《IEEE International Conference on Big Data》|2015年||共10页
会议地点
作者
Raj Bhatnagar; Lalit Kumar;
展开▼
作者单位

展开▼
会议组织
原文格式 PDF
正文语种
中图分类 TP393-53;
关键词

相似文献

外文文献
中文文献
专利

1. Data mining algorithms to compute mixed concepts with negative attributes: an application to breast cancer data analysis [J] . Manuel Rodriguez-Jimenez Jose, Cordero Pablo, Enciso Manuel, Mathematical Methods in the Applied Sciences . 2016,第16期

机译：数据挖掘算法，用于计算具有负属性的混合概念：在乳腺癌数据分析中的应用
2. Scalable formal concept analysis algorithms for large datasets using Spark [J] . Chunduri Raghavendra K., Cherukuri Aswani Kumar Journal of ambient intelligence and humanized computing . 2019,第11期

机译：使用Spark对大型数据集进行可扩展的形式概念分析算法
3. Data Reduction for Boolean Matrix Factorization Algorithms Based on Formal Concept Analysis [J] . Martin Trnecka, Marketa Trneckova Knowledge-Based Systems . 2018,第OCTa15期

机译：基于形式概念分析的布尔矩阵分解算法的数据约简
4. An efficient map-reduce algorithm for computing formal concepts from binary data [C] . Bhatnagar Raj, Kumar Lalit IEEE International Congress on Big Data . 2015

机译：一种用于从二进制数据计算形式概念的高效映射减少算法
5. A formal concept analysis approach to association rule mining: The QuICL algorithms. [D] . Smith, David T. 2009

机译：关联规则挖掘的正式概念分析方法：QuICK算法。
6. Formal Medical Knowledge Representation Supports Deep Learning Algorithms Bioinformatics Pipelines Genomics Data Analysis and Big Data Processes [O] . Ferdinand Dhombres, Jean Charlet 2019

机译：正式的医学知识表示支持深度学习算法生物信息学管道基因组学数据分析和大数据过程
7. Needed: An efficient algorithm for computing exact tail probabilities of randomization tests for treatment effects in a randomized block experiment with binary or trinary data [O] . Robson D. S., Cornell University. Biometrics Unit., Cornell University. Dept. of Biometrics., 1980

机译：需要：一种有效的算法，用于计算具有二值或三值数据的随机块实验中用于治疗效果的随机化测试的确切尾部概率
8. Efficient algorithms for dilated mappings of binary trees [R] . Iqbal, M. Ashraf 1990

机译：二叉树的扩张映射的有效算法

An Efficient Map-Reduce Algorithm for Computing Formal Concepts from Binary data

摘要

著录项

相似文献

相关主题

期刊订阅