Conference: The 14th International Conference on QiR

The impact of different fold for cross validation of missing values imputation method on hepatitis dataset

Abstract

Hepatitis is a liver disease caused by hepatitis viruses. It is a global health problem, including in Indonesia. Chronic hepatitis can lead to cirrhosis and liver cancer, so early diagnosis is needed. Several research works have developed computer-aided systems to improve the diagnosis of hepatitis. The University of California, Irvine (UCI) machine-learning repository provides a publicly accessible hepatitis dataset; however, the dataset contains many missing values. The existence of missing values in the dataset may affect the quality of the analysis results, so the missing values need to be handled. This paper analyses the effect of varying the number of folds used in the cross validation of missing-value imputation methods. The imputation method is combined with a feature selection method and a machine-learning algorithm on the hepatitis dataset. The results show that varying the number of folds in the k-fold cross validation applied to the imputation method does not yield a significant advantage.
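The abstract does not say which imputation method, feature selection method, or classifier was used, so the following is only a minimal sketch of the general idea of scoring an imputation method with k-fold cross validation for several values of k. It assumes mean imputation and an RMSE score on the observed entries; the function names `k_fold_indices` and `cv_mean_imputation_rmse` and the toy column are hypothetical, not taken from the paper.

```python
import random
from statistics import mean


def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cv_mean_imputation_rmse(values, k, seed=0):
    """Estimate the error of mean imputation with k-fold cross validation.

    values: list of floats, with None marking a missing entry.
    Only observed entries are used: each fold is hidden in turn, filled
    with the mean of the remaining folds, and compared with the truth.
    """
    observed = [v for v in values if v is not None]
    folds = k_fold_indices(len(observed), k, seed)
    squared_errors = []
    for held_out in folds:
        held = set(held_out)
        train = [observed[i] for i in range(len(observed)) if i not in held]
        fill = mean(train)  # imputation value learned without the held-out fold
        squared_errors.extend((observed[i] - fill) ** 2 for i in held_out)
    return (sum(squared_errors) / len(squared_errors)) ** 0.5


# Toy column with missing entries (made-up numbers, not the hepatitis data).
column = [1.0, None, 0.9, 1.2, None, 4.6, 0.7, 1.0, None, 2.3, 1.5, 0.6]
for k in (3, 5, 9):
    print(k, round(cv_mean_imputation_rmse(column, k), 3))
```

Comparing the printed scores across values of k gives a rough sense of how sensitive the imputation estimate is to the fold count. A fuller pipeline would repeat the imputation inside each training fold before feature selection and classification.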
