
Sequential Bayesian Regression for Multiple Imputation and Conditional Editing.


Abstract

Analysts faced with errors in data apply editing rules to fix the erroneous values. These edits are assigned deterministically and may not be correct in all cases. This dissertation presents a unified method to multiply impute missing data and multiply edit erroneous data using a sequence of Bayesian regression models. The techniques used to multiply edit erroneous data are an exact parallel to the multiple imputation used to correct missing data. The models presented allow for different data types subject to several error mechanisms.

This method is called Sequential Bayesian Regression for Multiple Imputation and Conditional Editing (SyBRMICE), and it creates multiple fully imputed and edited data sets. The desired analyses are performed individually on each complete, consistently edited and imputed data set. Results from these analyses are combined using the same combining rules used in multiple imputation. The resulting parameter estimates and intervals then correctly account for the errors incurred in both the data editing and imputation processes.

Development of SyBRMICE was motivated by data from Project Connect (PC). Project Connect was an 8-year longitudinal intervention study aiming to reduce teen pregnancy and STD rates in select middle and high schools in the Los Angeles area. Survey data were collected annually to measure the effectiveness of the interventions. A paper survey was administered to the students as a group in the classroom, and the student responses contain both missing and erroneous data.

The Project Connect survey was administered annually for five years. A subset of students participated in multiple years, resulting in repeated answers to the same question by the same student. Data errors found in the PC survey data can be categorized as belonging to one of several error types. If a variable that should remain constant over time, such as gender, is observed to differ across surveys, the variable is said to have an inconsistent longitudinal response. If a variable that should increase monotonically over time, such as age or ever having had sexual intercourse, is observed to have a non-monotonic reporting pattern, the variable is said to have an inconsistent monotonic longitudinal response. Lastly, if the responses to two or more related variables give conflicting information, these variables are said to have an inconsistent multiple response.

Models to stochastically edit each of the three types of erroneous data are presented. The inconsistent repeated measures, inconsistent monotone longitudinal, and inconsistent multivariate models are developed separately and then combined as steps in an example of the larger unifying SyBRMICE procedure. The examples demonstrate the flexibility and customizability of the SyBRMICE procedure. Results from an analysis performed on the multiple complete and consistent data sets generated by the SyBRMICE procedure are compared to results from the same analysis performed on a single deterministically edited, complete-case data set.
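The combining step mentioned in the abstract can be illustrated with a minimal sketch. It assumes that "the same combining rules used in multiple imputation" refers to the standard rules usually attributed to Rubin; the abstract does not spell them out. The function name `pool_rubin` and the numbers in the usage example are hypothetical and are not taken from the dissertation.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Combine m point estimates and their squared standard errors.

    Illustrative only: this is the standard multiple-imputation pooling
    (assumed here), not code from the dissertation.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need matching inputs from at least two data sets")
    q_bar = mean(estimates)        # pooled point estimate
    within = mean(variances)       # average within-data-set variance
    between = variance(estimates)  # sample variance across data sets
    total = within + (1 + 1 / m) * between
    # Degrees of freedom for the t reference distribution;
    # infinite when every data set gives the same estimate.
    if between == 0:
        df = float("inf")
    else:
        df = (m - 1) * (1 + within / ((1 + 1 / m) * between)) ** 2
    return q_bar, total, df


# Made-up estimates from m = 5 completed (imputed and edited) data sets.
estimates = [0.50, 0.54, 0.46, 0.52, 0.48]
variances = [0.010, 0.012, 0.011, 0.009, 0.010]
q_bar, total, df = pool_rubin(estimates, variances)
print(q_bar, total, df)
```

The between-data-set term is what carries the extra uncertainty: if the edited and imputed values vary across the completed data sets, the pooled variance grows accordingly, whereas a single deterministically edited data set has no such term.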

Bibliographic record

  • Author

    Jeffries, Robin Angela

  • Author affiliation

    University of California, Los Angeles

  • Degree-granting institution: University of California, Los Angeles
  • Subject: Biology Biostatistics
  • Degree: D.P.H.
  • Year: 2013
  • Pages: 190 p.
  • Total pages: 190
  • Original format: PDF
  • Language of text: English