Conference: International Conference on Data Management Technologies and Applications

A Visual Technique to Assess the Quality of Datasets: Understanding the Structure and Detecting Errors and Missing Values in Open Data CSV Files

Abstract

An ever-growing volume of information is published on the Web, and large datasets covering many fields and sectors are being made available. Open Data (OD) plays an important role in this trend. Thanks to the volume and variety of the released datasets, OD offers high societal and business potential. Realizing this potential depends on reuse of the datasets (e.g., in internal business processes). However, reusing OD also requires being able to assess its quality. This paper demonstrates how Information Visualization can help with this task and presents the Stacktab chart, a new chart for analysing and assessing CSV files in order to understand their structure, identify the location of relevant information, and detect possible problems in the datasets.