Incremental Sorting for Large Dynamic Data Sets



Abstract

In today's world of pervasive computing, it is straightforward for organizations to generate large amounts of data in support of a variety of business needs. For this reason, it is important to build tools that allow analysts to manage and investigate these data sets quickly and efficiently. One feature needed by these tools is the ability to sort large amounts of data along a number of dimensions to facilitate the search for useful information. In this paper, we describe a new method for incrementally sorting large, multi-dimensional, dynamic data sets. Our particular use case involves sorting large Twitter data sets but our technique can be applied more generally across a variety of data types. Our approach is evaluated with respect to its scalability and by comparing it to several alternatives. It is currently able to efficiently sort data sets consisting of tens of millions of tweets along a variety of dimensions even when the data set is under active collection and new tweets are being added each day. The approach incrementally integrates the new tweets and provides sorted views of all tweets along various dimensions without having to re-sort the previously sorted tweets. The paper presents the benefits of the technique, discusses its limitations, and describes its software engineering contributions.
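The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of incremental sorting, not the authors' method. It assumes each tweet is a dictionary, keeps one sorted view per dimension, sorts only the newly arrived batch, and merges that batch into the existing view in a single linear pass. The class name `IncrementalSortedViews`, its methods, and the field names `retweets` and `created_at` are hypothetical, chosen for illustration.

```python
import heapq
from operator import itemgetter


class IncrementalSortedViews:
    """Hypothetical helper: one sorted list of records per dimension."""

    def __init__(self, dimensions):
        # Each dimension starts with an empty sorted view.
        self.views = {dim: [] for dim in dimensions}

    def add_batch(self, records):
        # Sort only the new batch, then merge it with the existing view.
        # The previously sorted records are walked once by the merge,
        # never re-sorted.
        for dim, view in self.views.items():
            key = itemgetter(dim)
            batch = sorted(records, key=key)
            self.views[dim] = list(heapq.merge(view, batch, key=key))

    def sorted_view(self, dim):
        return self.views[dim]


# Illustrative data only; these are not records from the paper.
views = IncrementalSortedViews(["retweets", "created_at"])
views.add_batch([
    {"id": 1, "retweets": 5, "created_at": "2015-01-02"},
    {"id": 2, "retweets": 1, "created_at": "2015-01-01"},
])
views.add_batch([
    {"id": 3, "retweets": 3, "created_at": "2015-01-03"},
])

ids_by_retweets = [r["id"] for r in views.sorted_view("retweets")]
ids_by_date = [r["id"] for r in views.sorted_view("created_at")]
```

In this sketch, integrating a batch of k records into a view of n records costs O(k log k) for the batch sort plus O(n + k) for the merge, rather than O((n + k) log(n + k)) for a full re-sort. A system at the scale the abstract mentions would presumably keep its views in persistent storage rather than in in-memory lists.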


Bibliographic details

Source
IEEE International Conference on Big Data Computing Service and Applications | 2015 | pp. 170-175 | 6 pages
Authors
Aydin Ahmet Arif; Anderson Kenneth M.
Original format: PDF
Keywords
big data; dynamic data sets; incremental sorting


