Journal: BMC Medical Informatics and Decision Making

Data cleaning process for HIV-indicator data extracted from DHIS2 national reporting system: a case study of Kenya



Abstract

The District Health Information Software-2 (DHIS2) is widely used by countries for national-level aggregate reporting of health data. To best leverage DHIS2 data for decision-making, countries need to ensure that data within their systems are of the highest quality. Comprehensive, systematic, and transparent data cleaning approaches form a core component of preparing DHIS2 data for analyses. Unfortunately, there is a paucity of exhaustive and systematic descriptions of data cleaning processes employed on DHIS2-based data. The aim of this study was to report on the methods and results of a systematic and replicable data cleaning approach applied to HIV data gathered within DHIS2 from 2011 to 2018 in Kenya, for secondary analyses. Six programmatic area reports containing HIV indicators were extracted from DHIS2 for all care facilities in all counties in Kenya from 2011 to 2018. Data variables extracted included reporting rate, reporting timeliness, and HIV-indicator data elements per facility per year. In total, 93,179 facility-records from 11,446 health facilities were extracted for the years 2011 to 2018. Van den Broeck et al.'s framework, involving repeated cycles of a three-phase process (data screening, data diagnosis, and data treatment), was employed semi-automatically within a generic five-step data-cleaning sequence, which was developed and applied in cleaning the extracted data. Various quality issues were identified, and Friedman analysis of variance was conducted to examine differences in the distribution of records with selected issues across the eight years. Facility-records with no data accounted for 50.23% and were removed. Of the remaining records, 0.03% had reporting rates above 100%. Of facility-records with reporting data, 0.66% and 0.46% were retained for the voluntary medical male circumcision and blood safety programmatic area reports respectively, given that few facilities submitted data or offered these services. The distribution of facility-records with selected quality issues varied significantly by programmatic area (p < 0.001). The final clean dataset obtained was suitable for subsequent secondary analyses. Comprehensive, systematic, and transparent reporting of the cleaning process is important for the validity of research studies as well as for data utilization. The semi-automatic procedures used resulted in improved data quality for use in secondary analyses, which could not be secured by automated procedures alone.
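The screening and treatment steps mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' procedure: the record layout, the field names, the example values, and the rule used to decide that a record has "no data" are all assumptions made for illustration. The sketch removes facility-records with no data and flags, rather than deletes, records whose reporting rate exceeds 100%, so that they can be diagnosed manually.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class FacilityRecord:
    # Hypothetical layout of one facility-record (one facility, one report, one year).
    facility_id: str
    year: int
    report: str                              # programmatic area report name
    reporting_rate: Optional[float]          # percent; None if nothing was submitted
    indicators: Dict[str, Optional[float]]   # data element -> value, None if blank


def has_no_data(rec: FacilityRecord) -> bool:
    """Screening rule (assumed): no reporting rate and no indicator values at all."""
    no_rate = rec.reporting_rate is None or rec.reporting_rate == 0
    no_values = all(v is None for v in rec.indicators.values())
    return no_rate and no_values


def clean(
    records: List[FacilityRecord],
) -> Tuple[List[FacilityRecord], List[FacilityRecord], List[FacilityRecord]]:
    """Return (kept, removed, flagged) lists of facility-records."""
    kept, removed, flagged = [], [], []
    for rec in records:
        if has_no_data(rec):
            removed.append(rec)       # treatment: drop empty records
            continue
        if rec.reporting_rate is not None and rec.reporting_rate > 100:
            flagged.append(rec)       # diagnosis: implausible rate, review by hand
        kept.append(rec)
    return kept, removed, flagged


# Made-up example records, not data from the study.
records = [
    FacilityRecord("F001", 2011, "HIV testing", 95.0, {"tested": 120}),
    FacilityRecord("F002", 2011, "HIV testing", None, {"tested": None}),
    FacilityRecord("F003", 2012, "HIV testing", 108.3, {"tested": 40}),
]
kept, removed, flagged = clean(records)
print(len(kept), len(removed), len(flagged))
```

Keeping flagged records in the retained set, instead of dropping them automatically, mirrors the semi-automatic approach the abstract argues for: the code surfaces suspect records and a person decides how to treat them.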
