Conference: International Conference on Data Mining

Use of graph theory for data mining in public health

Abstract

Data mining problems are common in public health, for example identifying disease clusters and multidimensional patterns, such as socioeconomic differentials in health, within large databases. Although numerous data mining methods have been developed, currently available methods are not designed to handle complex pattern-searching queries. The aim of the study reported here was to test graph-theoretical methods for data mining in public health databases, specifically to identify areas of high deprivation that are surrounded by affluent areas and deprived areas that are surrounded by other deprived areas. Graph theory, using the maximum common subgraph isomorphism (mcs) method, was used to search a database containing information on the 10920 enumeration districts (EDs) of the Trent Region of England. Each ED was allocated to a deprivation quintile based on the Townsend Deprivation Score. The mcs program was then used to identify deprived EDs that are adjacent to deprived EDs and deprived EDs that are adjacent to affluent EDs. The program identified 1528 deprived EDs adjacent to at least two deprived EDs, 1181 adjacent to at least three deprived EDs, 802 adjacent to at least four deprived EDs, and 505 adjacent to at least five deprived EDs. It also identified 147 deprived EDs adjacent to at least two affluent EDs, 54 adjacent to at least three affluent EDs, 14 adjacent to at least four affluent EDs, and six adjacent to at least five affluent EDs. The retrieved EDs were then used for hypothesis testing with statistical methods. The study demonstrates the potential of graph-theoretical techniques for data mining in public health databases.
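The kind of query described in the abstract (a deprived ED adjacent to at least k deprived or affluent EDs) can be illustrated with a small sketch. This is not the authors' mcs program: it replaces maximum common subgraph matching with a plain neighbour count on an adjacency graph, which gives the same answer only for this simple "centre plus k labelled neighbours" pattern. The ED identifiers, the adjacency list, the label strings `"deprived"` and `"affluent"`, and the function names are all hypothetical; the abstract does not say which Townsend quintiles count as deprived or affluent.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency map from (ed_a, ed_b) pairs."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return adj


def find_centres(adj, label, centre_label, neighbour_label, k):
    """Return EDs labelled centre_label that have at least k
    adjacent EDs labelled neighbour_label (sorted for stable output)."""
    return sorted(
        ed
        for ed, lab in label.items()
        if lab == centre_label
        and sum(1 for n in adj.get(ed, ()) if label.get(n) == neighbour_label) >= k
    )


# Toy data (hypothetical): six EDs with assumed deprivation labels.
label = {
    "A": "deprived", "B": "deprived", "C": "deprived",
    "D": "affluent", "E": "affluent", "F": "affluent",
}
# Assumed adjacency: each pair shares a boundary.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"),
         ("C", "D"), ("C", "E"), ("D", "F")]

adj = build_adjacency(edges)

# Deprived EDs adjacent to at least two deprived EDs.
deprived_cluster = find_centres(adj, label, "deprived", "deprived", 2)
# Deprived EDs adjacent to at least two affluent EDs.
deprived_island = find_centres(adj, label, "deprived", "affluent", 2)

print(deprived_cluster)
print(deprived_island)
```

Raising `k` from two to five reproduces the shape of the nested counts reported above, since every ED that satisfies a larger `k` also satisfies each smaller one. More complex patterns, such as constraints among the neighbours themselves, would need a true subgraph-matching method of the kind the study used.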
