International Journal of Information and Communication Technology Research

Application of k-Nearest Neighbour Classification in Medical Data Mining


Abstract

Medical data, in the form of patient records, is an ever-growing source of information from hospitals. When mined, the information hidden in these records is a valuable resource for medical research. The data contains hidden patterns and relationships that can lead to better diagnosis, yet these patterns and relationships often go undiscovered and unexploited. Studies in medical diagnosis have used past patient data to predict heart diseases, lung diseases, and various tumors. However, they are mostly limited to domain-specific systems that predict only diseases within their own area of operation. The performance of the k-nearest neighbor (k-NN) classifier depends heavily on the distance metric used to identify the k nearest neighbors of a query point; the standard Euclidean distance is commonly used in practice. This study draws on a large store of information so that diagnoses can be made from historical data. It focuses on computing the probability of occurrence of a particular ailment by using a unique algorithm. This k-NN algorithm increases the accuracy of such diagnoses. The algorithm can be used to enhance automated diagnosis, including the diagnosis of multiple diseases that show similar symptoms. To validate the experimental results, a hypothesis was tested for the following variables: accidents, age, allergies, blood pressure, smoking habit, total cholesterol, diabetes and hypertension, family history of heart disease, obesity, and lack of physical activity. The results showed a strong relationship between these variables and the causes of common chronic diseases such as heart ailments, diabetes, and cancer.
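The abstract does not spell out the algorithm, so the following is only a minimal sketch of a generic k-NN classifier with Euclidean distance, not the authors' method. It estimates the probability of each diagnosis as the share of the k nearest historical records carrying that label. The function names, the three-feature records, and the labels are all hypothetical, and the feature values are assumed to be already scaled to the range 0 to 1 (without scaling, a feature with a large numeric range would dominate the Euclidean distance).

```python
import math
from collections import Counter


def euclidean(a, b):
    """Standard Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_class_probabilities(train_X, train_y, query, k=3):
    """Estimate class probabilities as vote fractions among the k nearest records."""
    if not 1 <= k <= len(train_X):
        raise ValueError("k must be between 1 and the number of training records")
    # Sort the historical records by distance to the query and keep the k closest.
    neighbors = sorted(
        zip(train_X, train_y),
        key=lambda pair: euclidean(pair[0], query),
    )[:k]
    votes = Counter(label for _, label in neighbors)
    return {label: count / k for label, count in votes.items()}


def knn_predict(train_X, train_y, query, k=3):
    """Return the label with the highest estimated probability."""
    probs = knn_class_probabilities(train_X, train_y, query, k)
    return max(probs, key=probs.get)


# Made-up, pre-scaled records: (age, blood pressure, total cholesterol).
# These values are illustrative only and do not come from the paper.
train_X = [
    (0.2, 0.3, 0.2),
    (0.3, 0.2, 0.3),
    (0.8, 0.9, 0.7),
    (0.7, 0.8, 0.9),
    (0.9, 0.7, 0.8),
]
train_y = ["healthy", "healthy", "heart_disease", "heart_disease", "heart_disease"]

if __name__ == "__main__":
    query = (0.75, 0.8, 0.8)
    print(knn_class_probabilities(train_X, train_y, query, k=3))
    print(knn_predict(train_X, train_y, query, k=3))
```

Swapping `euclidean` for another distance function is the only change needed to try a different metric, which is the sensitivity the abstract points to.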
