Analysis of the HSEES Chemical Incident Database Using Data and Text Mining Methodologies

Abstract

Chemical incidents can be prevented or mitigated by improving safety performance and implementing the lessons learned from past incidents. Despite some limitations in the range of information they provide, chemical incident databases can serve as sources of lessons learned when the patterns and relationships among their data variables are evaluated. Much of the previous research focused on the causal factors of incidents; this research instead analyzes chemical incidents from both their causal and consequence elements. A subset of incident data reported to the Hazardous Substances Emergency Events Surveillance (HSEES) chemical incident database from 2002 to 2006 was analyzed using data mining and text mining methodologies, both performed with the aid of STATISTICA software. The analysis covered 12,737 chemical process related incidents and extracted free-text incident descriptions from 3,316 incident reports. The structured data were analyzed using data mining tools such as classification and regression trees, association rules, and cluster analysis. The unstructured (textual) data were transformed into structured data using text mining and then analyzed further using data mining tools such as feature selection and cluster analysis. The data mining analysis demonstrated that this technique can be used to estimate incident severity from two input variables: release quantity and the distance between victims and the source of release. Using the subset of ammonia release data, the classification and regression tree produced 23 final nodes. Each final node corresponded to a range of release quantity and a range of distance between victims and the source of release. For each node, the severity of injury was estimated as the average of the observed severity scores.
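The node-level severity estimate described above can be illustrated with a small regression tree. The following is a minimal sketch, not the STATISTICA procedure used in the study: it assumes a greedy split that minimizes the sum of squared errors, a depth limit of two, and a handful of invented records. The names `RECORDS`, `best_split`, `grow`, and `predict` are hypothetical, and none of the numbers come from HSEES.

```python
from statistics import mean

# Hypothetical toy records: (release_quantity, distance_to_source, severity_score).
# All values are invented for illustration; they are NOT HSEES data.
RECORDS = [
    (10, 500, 0), (20, 50, 0), (15, 400, 0), (30, 30, 1),
    (500, 600, 2), (600, 40, 5), (800, 450, 2), (900, 20, 5),
]


def sse(values):
    """Sum of squared errors around the mean (0 for an empty list)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def best_split(rows):
    """Return the (feature_index, threshold) that most reduces SSE, or None."""
    best, best_cost = None, sse([r[2] for r in rows])
    for f in (0, 1):  # 0 = release quantity, 1 = distance
        values = sorted({r[f] for r in rows})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # candidate threshold between adjacent values
            left = [r[2] for r in rows if r[f] <= t]
            right = [r[2] for r in rows if r[f] > t]
            cost = sse(left) + sse(right)
            if cost < best_cost:
                best, best_cost = (f, t), cost
    return best


def grow(rows, depth=0, max_depth=2):
    """Grow the tree recursively; each final node stores the mean severity."""
    split = best_split(rows) if depth < max_depth else None
    if split is None:
        return {"leaf": mean(r[2] for r in rows)}
    f, t = split
    return {
        "feature": f,
        "threshold": t,
        "left": grow([r for r in rows if r[f] <= t], depth + 1, max_depth),
        "right": grow([r for r in rows if r[f] > t], depth + 1, max_depth),
    }


def predict(tree, quantity, distance):
    """Walk from the root to a final node and return its mean severity."""
    x = (quantity, distance)
    while "leaf" not in tree:
        branch = "left" if x[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree["leaf"]


TREE = grow(RECORDS)
```

Calling `predict(TREE, quantity, distance)` returns the average observed severity of the final node whose quantity and distance ranges contain the query, which mirrors how each of the 23 nodes in the abstract yields one severity estimate.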
The association rule analysis identified conditional probabilities for incidents involving piping, chlorine, ammonia, and benzene of 0.19, 0.04, 0.12, and 0.04, respectively. Text mining was used successfully to generate elements of incidents that can be used in developing incident scenarios. The research also identified information gaps in the HSEES database that, if addressed, would enhance future data analysis. The findings from data mining and text mining should then be used to modify or revise design, operation, emergency response planning, or other management strategies.
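The conditional probability reported by an association rule is usually called its confidence: the share of records containing the antecedent that also contain the consequent. The sketch below shows that calculation on invented records. The abstract does not say which antecedents and consequents the study paired, so the tags, the `INCIDENTS` list, and the helper names `support` and `confidence` are all assumptions made for illustration.

```python
# Hypothetical incident records as sets of attribute tags.
# These are invented for illustration; they are NOT HSEES data.
INCIDENTS = [
    {"piping", "ammonia", "injury"},
    {"piping", "chlorine"},
    {"piping", "ammonia"},
    {"storage_tank", "ammonia", "injury"},
    {"piping", "benzene", "injury"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimate P(consequent | antecedent) from co-occurrence counts."""
    base = support(antecedent, transactions)
    if base == 0:
        raise ValueError("antecedent never occurs in the transactions")
    joint = support(set(antecedent) | set(consequent), transactions)
    return joint / base


# Example rule on the toy data: {piping} -> {injury}
piping_injury = confidence({"piping"}, {"injury"}, INCIDENTS)
```

A rule-mining tool would enumerate many candidate rules and keep those above chosen support and confidence thresholds; this sketch only evaluates one rule at a time.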

Bibliographic record

  • Author: Mahdiyati
  • Year: 2011
  • Original format: PDF
  • Text language: en_US
