Online Journal of Public Health Informatics

Using Change Point Detection for Monitoring the Quality of Aggregate Data



Abstract

Introduction

Data consisting of counts or indicators aggregated from multiple sources pose particular problems for data-quality monitoring when the users of the aggregate data are blind to the individual sources. This arises when agencies wish to share data but, for privacy or contractual reasons, can only do so at an aggregate level. If the aggregators of the data are unable to guarantee the quality of either the sources of the data or the aggregation process, then the quality of the aggregate data may be compromised.

This situation arose in the Distribute surveillance system (). Distribute was a national emergency-department syndromic surveillance project for influenza-like illness (ILI), developed by the International Society for Disease Surveillance, that integrated data from existing state and local public health department surveillance systems and operated from 2006 until mid-2012. Distribute was designed to work solely with aggregated data: sites provided data aggregated from sources within their jurisdiction, and detailed information on the un-aggregated 'raw' data was unavailable.

Previous work () on Distribute data quality identified several issues caused in part by the nature of the system: transient problems due to inconsistent uploads, problems associated with transient or long-term changes in the source makeup of the reporting sites, and a lack of data timeliness because individual site data accrued over time rather than in batch. Data timeliness was addressed using prediction intervals to assess the reliability of the partially accrued data (). The types of data-quality issues present in the Distribute data are likely to appear, to some extent, in any aggregate-data surveillance system where direct control over the quality of the source data is not possible. In this work we present methods for detecting both transient and long-term changes in the source data makeup.
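To make the idea concrete, the kind of long-term change described above (a source silently dropping out of a reporting site, shifting the aggregate level) can be flagged with a standard change point technique such as a CUSUM chart. The sketch below is illustrative only and is not the authors' published method; the function name, the threshold and drift parameters, and the simulated counts are assumptions for the example.

```python
# Illustrative sketch -- not the authors' published method. A two-sided
# CUSUM detector flags a sustained level shift in an aggregate count
# series, such as one produced when a source stops reporting.
from statistics import mean, stdev

def cusum_alarms(series, baseline_n=30, threshold=5.0, drift=0.5):
    """Standardize against a baseline window, then run a two-sided CUSUM.
    Returns the indices at which either cumulative sum exceeds the
    threshold (both sums reset after each alarm)."""
    mu, sigma = mean(series[:baseline_n]), stdev(series[:baseline_n])
    pos = neg = 0.0
    alarms = []
    for i in range(baseline_n, len(series)):
        v = (series[i] - mu) / sigma
        pos = max(0.0, pos + v - drift)   # accumulates upward shifts
        neg = max(0.0, neg - v - drift)   # accumulates downward shifts
        if pos > threshold or neg > threshold:
            alarms.append(i)
            pos = neg = 0.0
    return alarms

# Simulated daily ILI counts with a weekly pattern: one source stops
# reporting at day 60, dropping the aggregate level from ~100 to ~70.
counts = ([100 + (i % 7) for i in range(60)]
          + [70 + (i % 7) for i in range(60, 120)])
print(cusum_alarms(counts)[0])  # first alarm at index 60, the day of the shift
```

The drift term makes the detector insensitive to the routine weekly fluctuation while a sustained shift accumulates quickly; transient single-day problems such as a missed upload would instead produce an isolated spike in the standardized value.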
