Journal: Acta Universitatis Agriculturae et Silviculturae Mendelianae Brunensis

Missing Categorical Data Imputation and Individual Observation Level Imputation


Abstract

Traditional imputation techniques for missing data predict each missing value from the other observed values. For continuous data, imputation typically relies on regression models. For categorical data, the usual approach is a classification technique that sets the missing value to the ‘most likely’ category. This, however, overrepresents the categories that are observed more often and can therefore bias the results of many tasks, especially when dominant categories are present. We present an original imputation methodology that yields the most likely structure (distribution) of the missing data conditional on the observed values. The methodology assumes that the categorical variable containing the missing values follows a multinomial distribution, whose parameters are then estimated using multinomial logistic regression. An illustrative example describes missing values of the highest education level attained by persons in a population and their reconstruction.
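The contrast between assigning the ‘most likely’ category and preserving the category distribution can be illustrated with a small sketch. It is not the paper's algorithm: the category probabilities are hypothetical stand-ins for the output of a fitted multinomial logistic regression, all missing cases are assumed to share one covariate profile, and largest-remainder rounding of the expected counts is used as a simple approximation of the most likely count structure.

```python
from collections import Counter


def impute_most_likely(probs, n_missing):
    """Classification-style imputation: every missing value gets the argmax category."""
    top = max(probs, key=probs.get)
    return Counter({top: n_missing})


def impute_expected_structure(probs, n_missing):
    """Allocate the missing values so that counts track n_missing * p.

    Uses largest-remainder rounding (an assumption of this sketch,
    not a method taken from the paper).
    """
    expected = {c: n_missing * p for c, p in probs.items()}
    counts = {c: int(e) for c, e in expected.items()}  # floor of each expected count
    leftover = n_missing - sum(counts.values())
    # Hand the remaining units to the categories with the largest fractional parts.
    by_remainder = sorted(probs, key=lambda c: expected[c] - counts[c], reverse=True)
    for c in by_remainder[:leftover]:
        counts[c] += 1
    return Counter(counts)


# Hypothetical fitted probabilities for "highest education level";
# in practice these would come from a multinomial logistic regression.
probs = {"primary": 0.15, "secondary": 0.55, "tertiary": 0.30}
n_missing = 20

mode_counts = impute_most_likely(probs, n_missing)
structure_counts = impute_expected_structure(probs, n_missing)

print("most-likely category:", dict(mode_counts))
print("distribution-preserving:", dict(structure_counts))
```

Under these assumed probabilities, the first strategy assigns all 20 missing values to the dominant category, while the second spreads them across the categories in proportion to the probabilities, which is the overrepresentation effect the abstract describes.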
