Journal: International Journal of Pervasive Computing and Communications

Early prediction of chronic disease using an efficient machine learning algorithm through adaptive probabilistic divergence based feature selection approach


Abstract

Purpose - According to the World Health Organization, chronic diseases are expected to rise to 73% of all deaths by 2025 and to account for 60% of the global burden of disease. These diseases persist for a long time, are almost incurable and can only be controlled. Cardiovascular disease, chronic kidney disease (CKD) and diabetes mellitus are considered the three major chronic diseases, and the risk they pose to adults increases with age. CKD is considered a major disease among them: about 10% of the world's population is affected by CKD, and this figure is likely to double by 2030. The paper aims to propose a novel feature selection approach that, in combination with a machine learning algorithm, can predict chronic disease early and with the utmost accuracy. To this end, a novel adaptive probabilistic divergence-based feature selection (APDFS) algorithm is proposed in combination with a hyper-parameterized logistic regression model (HLRM) for the early prediction of chronic disease.

Design/methodology/approach - The proposed APDFS algorithm explicitly handles the features associated with the class label through relevance and redundancy analysis. It applies statistical divergence-based information theory to identify the relationships between distant features of the chronic disease data set. The data set used in the experiments was obtained from several medical labs and hospitals in India. The HLRM is used as the machine learning classifier. The predictive ability of the framework is compared against various algorithms and across various chronic disease data sets. The experimental results illustrate that the proposed framework is efficient and achieves competitive results compared to existing work in most cases.
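The abstract does not spell out the APDFS procedure, so the following is only a minimal sketch of the general idea of divergence-based relevance and redundancy analysis, not the authors' algorithm. It assumes discrete features and binary labels; it uses the Jensen-Shannon divergence between class-conditional feature distributions as a stand-in relevance score and mutual information between features as a stand-in redundancy score. The function names, the greedy selection rule and the toy data are all hypothetical.

```python
import math
from collections import Counter


def distribution(values, support):
    """Empirical probability of each value in `support` (assumes `values` is non-empty)."""
    counts = Counter(values)
    total = len(values)
    return [counts[v] / total for v in support]


def kl(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric and bounded by 1 bit."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def relevance(feature, labels):
    """Assumed relevance score: divergence between P(feature | y=1) and P(feature | y=0)."""
    support = sorted(set(feature))
    pos = [f for f, y in zip(feature, labels) if y == 1]
    neg = [f for f, y in zip(feature, labels) if y == 0]
    return js_divergence(distribution(pos, support), distribution(neg, support))


def mutual_information(a, b):
    """Assumed redundancy score: KL divergence between the joint and the product of marginals."""
    n = len(a)
    pa, pb, pab = Counter(a), Counter(b), Counter(zip(a, b))
    return sum(
        (c / n) * math.log2((c / n) / ((pa[x] / n) * (pb[y] / n)))
        for (x, y), c in pab.items()
    )


def select_features(columns, labels, k):
    """Greedily pick k features, trading relevance against mean redundancy."""
    rel = {name: relevance(col, labels) for name, col in columns.items()}
    selected = []
    while len(selected) < min(k, len(columns)):
        def score(name):
            if not selected:
                return rel[name]
            red = sum(mutual_information(columns[name], columns[s]) for s in selected)
            return rel[name] - red / len(selected)
        remaining = [name for name in columns if name not in selected]
        selected.append(max(remaining, key=score))
    return selected


# Hypothetical toy data: "a_copy" duplicates "a"; "c" is weaker but less redundant.
labels = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
columns = {
    "a":      [1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1],
    "a_copy": [1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1],
    "c":      [1, 1, 1, 1, 0, 0, 0, 0, 1, 0, 0, 1],
}
selected = select_features(columns, labels, k=2)
print(selected)
```

On this toy data the duplicate column is passed over in favour of the less redundant one, which is the behaviour a relevance-plus-redundancy criterion is meant to produce. Real clinical features are often continuous and would need discretization or a density estimate first.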
Findings - The performance of the proposed framework is validated using metrics such as recall, precision, F1 measure and ROC. Its predictive performance is analyzed on data sets belonging to various chronic diseases such as CKD, diabetes and heart disease. The diagnostic ability of the proposed approach is demonstrated by comparing its results with those of existing algorithms. The experimental figures illustrate that the proposed framework performed exceptionally well in the early prediction of CKD, with an accuracy of 91.6.

Originality/value - The capability of machine learning algorithms depends on how well feature selection (FS) algorithms identify the relevant traits in the data set, which affects the predictive result. FS is the process of choosing the relevant features from the data set by removing redundant and irrelevant ones. Although many approaches have already been proposed toward this objective, they are computationally complex because they follow a one-step scheme in selecting the features. In this paper, a novel feature selection algorithm, APDFS, is proposed which explicitly handles the features associated with the class label through relevance and redundancy analysis. The proposed algorithm handles the process of feature selection in two separate indices. Hence, the computational complexity of the algorithm is reduced to O(n^(k+1)). The algorithm applies statistical divergence-based information theory to identify the relationships between distant features of the chronic disease data set. The data set used in the experiments was obtained from several medical labs and hospitals in Karkala taluk, India. The HLRM is used as the machine learning classifier. The predictive ability of the framework is compared against various algorithms and across various chronic disease data sets. The experimental results illustrate that the proposed framework is efficient and achieves competitive results compared to existing work in most cases.
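For readers unfamiliar with the evaluation metrics named in the findings, the sketch below shows how precision, recall, F1 and accuracy follow from the confusion counts of a binary classifier. The labels and predictions are made-up toy values, not results from the paper, and the function name is hypothetical.

```python
def binary_metrics(y_true, y_pred):
    """Return (precision, recall, f1, accuracy) for labels in {0, 1}."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = (tp + tn) / len(pairs)
    return precision, recall, f1, accuracy


# Toy example: 3 true positives, 1 false negative, 1 false positive, 3 true negatives.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
precision, recall, f1, accuracy = binary_metrics(y_true, y_pred)
print(precision, recall, f1, accuracy)
```

In a screening setting such as early CKD detection, recall is often the metric to watch, since a false negative means a missed case.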
