IEEE Transactions on Knowledge and Data Engineering

Order-Sensitive Imputation for Clustered Missing Values


Abstract

Missing values (MVs) appear widely in real-world datasets and hinder the use of many statistical and machine learning algorithms for data analytics, because those algorithms cannot handle incomplete datasets. To address this issue, several MV imputation algorithms have been developed. However, these approaches do not perform well when most of the incomplete tuples are clustered with each other, coined here as the Clustered Missing Values Phenomenon, which leaves too few complete tuples near an MV for imputation. In this paper, we propose the Order-Sensitive Imputation for Clustered Missing values (OSICM) framework, in which missing values are imputed sequentially such that the values filled earlier in the process are also used for later imputation of other MVs. The order of imputations is therefore critical to the effectiveness and efficiency of the OSICM framework. We formulate the search for the optimal imputation order as an optimization problem and show its NP-hardness. Furthermore, we devise an algorithm to find the exact optimal solution and propose two approximate/heuristic algorithms to trade off effectiveness for efficiency. Finally, we conduct extensive experiments on real and synthetic datasets to demonstrate the superiority of our OSICM framework.
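
The abstract's central idea, that values filled earlier become available as evidence for later imputations, can be illustrated with a minimal sketch. This is not the OSICM algorithm itself: the abstract does not specify the underlying imputation method, so a plain k-nearest-neighbour mean is assumed here, and the names `distance` and `impute_in_order` are hypothetical. The sketch only shows that two different imputation orders over the same clustered missing values can produce different results.

```python
from math import sqrt


def distance(a, b):
    """Normalized Euclidean distance over attributes filled in both tuples."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return float("inf")
    return sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))


def impute_in_order(rows, order, k=2):
    """Fill missing cells one at a time, following `order`.

    rows  : list of lists, with None marking a missing value
    order : list of (row_index, column_index) pairs, in filling sequence
    k     : number of nearest donor tuples to average (assumed method)
    """
    data = [list(r) for r in rows]  # work on a copy
    for i, j in order:
        # Donors are all other tuples whose attribute j is filled,
        # including values imputed earlier in this loop.
        donors = [r for idx, r in enumerate(data)
                  if idx != i and r[j] is not None]
        donors.sort(key=lambda r: distance(data[i], r))
        nearest = donors[:k]
        if nearest:
            data[i][j] = sum(r[j] for r in nearest) / len(nearest)
    return data


# Two incomplete tuples sit next to each other (a small "cluster" of MVs).
rows = [
    [1.0, 10.0],
    [2.0, None],
    [3.0, None],
    [9.0, 90.0],
]

result_a = impute_in_order(rows, order=[(1, 1), (2, 1)])
result_b = impute_in_order(rows, order=[(2, 1), (1, 1)])

print(result_a[1][1], result_a[2][1])
print(result_b[1][1], result_b[2][1])
```

In this toy dataset the two orders assign different values to the same cells, which is why the choice of order has to be treated as part of the problem rather than as an implementation detail.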
