Fractal Dimension Calculation for Big Data Using Box Locality Index

Rong Liu; Robert Rallo; Yoram Cohen


Abstract

The box-counting approach for fractal dimension calculation is scaled up for big data using a data structure named the box locality index (BLI). The BLI is constructed as key-value pairs, with the key indexing the location of a "box" (i.e., a grid cell in the multi-dimensional space) and the value counting the number of data points inside the box (i.e., the "box occupancy"). This key-value structure significantly simplifies the hierarchical structure traditionally used and encodes only the information that the box-counting approach requires for fractal dimension calculation. Moreover, because the box occupancies (i.e., the values) associated with the same index (i.e., the key) can be aggregated, the BLI gives the box-counting approach the scalability needed to calculate the fractal dimension of big data using distributed computing techniques (e.g., MapReduce and Spark). Taking advantage of the BLI, MapReduce and Spark methods for fractal dimension calculation of big data are developed; they conduct box-counting for each grid level as a cascade of MapReduce/Spark jobs in a bottom-up fashion. In an empirical validation, the MapReduce and Spark methods demonstrated good effectiveness and efficiency in calculating the fractal dimension of a big synthetic dataset. In summary, this work provides an efficient solution for estimating the intrinsic dimension of big data, which is essential for many machine learning methods and data analytics tasks, including feature selection and dimensionality reduction.
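The key-value idea in the abstract can be illustrated with a minimal single-machine sketch in Python. This is not the authors' MapReduce/Spark implementation: the function names (`build_bli`, `coarsen`, `box_counts`, `box_counting_dimension`) are hypothetical, and the sketch assumes that the data are rescaled to the unit hypercube [0, 1)^d and that the grid at level l has 2^l cells per axis. Under those assumptions, each coarser level is obtained by summing the occupancies of child boxes that share a parent key, which plays the role that a reduce-by-key step would play in a distributed job. The dimension is then estimated as the least-squares slope of log N(l) against log(2^l), where N(l) is the number of occupied boxes at level l.

```python
import math
from collections import Counter


def build_bli(points, level):
    """Build the finest-level index: key = integer cell coordinates, value = occupancy.

    Assumes every coordinate lies in [0, 1); a value of exactly 1.0 is clamped
    into the last cell.
    """
    cells = 2 ** level
    return Counter(
        tuple(min(int(x * cells), cells - 1) for x in p) for p in points
    )


def coarsen(bli):
    """Derive the next coarser level by summing occupancies per parent box."""
    parent = Counter()
    for key, count in bli.items():
        parent[tuple(k // 2 for k in key)] += count
    return parent


def box_counts(points, finest_level, coarsest_level=1):
    """Return {level: number of occupied boxes}, computed bottom-up."""
    bli = build_bli(points, finest_level)
    counts = {finest_level: len(bli)}
    for level in range(finest_level - 1, coarsest_level - 1, -1):
        bli = coarsen(bli)
        counts[level] = len(bli)
    return counts


def box_counting_dimension(points, finest_level, coarsest_level=1):
    """Least-squares slope of log N(level) versus log(1 / box size)."""
    counts = box_counts(points, finest_level, coarsest_level)
    levels = sorted(counts)
    xs = [level * math.log(2) for level in levels]  # log(1/r) with r = 2**-level
    ys = [math.log(counts[level]) for level in levels]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx


if __name__ == "__main__":
    # Points on the diagonal of the unit square: a one-dimensional set in 2-D.
    diagonal = [(i / 4096, i / 4096) for i in range(4096)]
    print(box_counts(diagonal, finest_level=6))
    print(box_counting_dimension(diagonal, finest_level=6))
```

For the diagonal example the estimate should come out close to 1, since the points lie on a line even though they are embedded in two dimensions. Only the finest level touches the raw points; every coarser level is computed from the previous index, which is what keeps the per-level work proportional to the number of occupied boxes rather than the number of data points.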

Bibliographic details

Source: Annals of Data Science, 2018, Issue 4, pp. 549-563 (15 pages)
Authors: Rong Liu; Robert Rallo; Yoram Cohen
Author affiliations:

Institute of the Environment and Sustainability, University of California;

Advanced Computing, Mathematics, and Data Division, Pacific Northwest National Laboratory;

Institute of the Environment and Sustainability, University of California; Chemical and Biomolecular Engineering Department, University of California; Center for Environmental Implications of Nanotechnology, University of California

Original format: PDF
Language: English
Keywords: Fractal dimension; Intrinsic dimension; Box-counting; Box locality index; MapReduce; Spark
