Conference: International Conference on Data Science, Technology and Applications

Data Quality in Secondary Data Analysis: A Case Study of Ecological Data using a Semiotic-based Approach

Abstract

Data quality problems are widespread when secondary data are used for data warehousing and data mining. This paper advocates a broad semiotic approach to data quality. The main premises of this expanded semiotic framework are (1) data represent some reality, (2) data are created and interpreted by humans in a communication process, (3) data are used for specific purposes by humans, and (4) data cannot be created, interpreted and used without knowledge. Thus, the semiotic-based approach to data quality in secondary data analysis has four aspects: (1) representational, (2) communicational, (3) pragmatic, and (4) knowledge-based. To illustrate these four aspects, we present a case study of ecological data analysis used in the creation of an ornithological data warehouse. We discuss temporal data (the ecological notion of time), spatial ecological data (communication processes and protocols used for data collection), and bioacoustic data processing (domain knowledge needed for the specification of data provenance).