
Semantic deontic modeling and text classification for supporting automated environmental compliance checking in construction

Abstract

Compliance checking in the construction industry refers to checking the conformance of a process, plan, document, design, or action to applicable norms (regulatory norms, contractual norms, and advisory practices). Manual compliance checking is time-intensive, resource-consuming, and error-prone. Automated compliance checking (ACC) is thus a more efficient approach to compliance assessment. However, ACC in the construction domain remains a challenge. Current ACC systems do not provide the level of knowledge representation and reasoning needed to efficiently interpret applicable norms (laws, regulations, contractual requirements, advisory practices, etc.) and to check the conformance of designs and operations to those interpretations. This thesis therefore explores a new approach to automated regulatory and contractual compliance checking: applying theoretical and computational developments in deontology, deontic logic, and natural language processing (NLP) to the problem of compliance checking in construction. Deontology is a theory of rights and obligations; deontic logic is a branch of modal logic that deals with obligations, permissions, and related notions. A deontology for ACC would serve as a normative model for ACC knowledge representation and reasoning. NLP is a theoretically based computerized approach to analyzing, representing, and manipulating natural language text in order to achieve human-like language processing for a range of tasks or applications. It also covers how humans and computers interact through natural human language (e.g., English). NLP is particularly important in ACC, as all norms are documented in natural language text.
As such, NLP is needed to: 1) classify and retrieve applicable norms and information from large volumes of textual documents using text classification algorithms, and 2) extract and formalize natural language rules or project information expressed in textual documents using information extraction techniques. The first thesis objective is to develop an upper-level domain deontology for ACC in construction. The purpose of the deontology is to represent laws and regulations and to reason about the compliance of construction operations with those laws and regulations. A deontic model deals with assessing whether a specific action or state is right or wrong, permitted or forbidden. It uses deontic logic for normative reasoning about the ideal versus actual behavior or state of systems, such as formal contract representation, automated contractual analysis, and violation assessment systems. This deontology represents the first deontic modeling initiative in the construction domain. The deontology is composed of: 1) concepts of ACC in the construction domain, such as ‘compliance assessor’, ‘compliance agent’, ‘subject’, ‘authority’, and ‘compliance checking result’; 2) inter-concept relationships, such as ‘compliance assessor assesses compliance agent’; and 3) axioms, which specify the definitions of the concepts and relations in the deontology and constrain their interpretation. Axioms also represent the constraints of the ACC domain. The second thesis objective is to evaluate the deontology and demonstrate its application using real project case studies. The initial evaluation of the deontic model showed its potential to address the needs of ACC in construction.
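The kind of normative check described above can be illustrated with a minimal Python sketch. It is not the thesis's deontology: the `Modality`, `Norm`, and `check_compliance` names, the field layout, and the example actions are all hypothetical, chosen only to show how obligations, permissions, and prohibitions lead to a compliance checking result.

```python
from dataclasses import dataclass
from enum import Enum


class Modality(Enum):
    """The three basic deontic modalities."""
    OBLIGATION = "obligation"
    PERMISSION = "permission"
    PROHIBITION = "prohibition"


@dataclass(frozen=True)
class Norm:
    """A single norm (hypothetical structure, not the thesis's model)."""
    authority: str   # who issues the norm
    agent: str       # the compliance agent bound by the norm
    modality: Modality
    action: str      # the action the norm is about


def check_compliance(norm, performed_actions):
    """Return 'compliant' or 'non-compliant' for one norm.

    performed_actions is the set of actions the agent actually carried out.
    """
    done = norm.action in performed_actions
    if norm.modality is Modality.OBLIGATION:
        return "compliant" if done else "non-compliant"
    if norm.modality is Modality.PROHIBITION:
        return "non-compliant" if done else "compliant"
    # A permission cannot be violated, whether or not the action was taken.
    return "compliant"


# Illustrative (made-up) norms and actions for a storm-water scenario.
must_fence = Norm("permitting authority", "contractor",
                  Modality.OBLIGATION, "install silt fence")
no_discharge = Norm("permitting authority", "contractor",
                    Modality.PROHIBITION, "discharge untreated runoff")
actions = {"install silt fence", "discharge untreated runoff"}

print(check_compliance(must_fence, actions))    # obligation fulfilled
print(check_compliance(no_discharge, actions))  # prohibition violated
```

A full deontology would also need axioms over concepts and relations; this sketch covers only the per-norm comparison of ideal versus actual behavior.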
The model was initially evaluated through: 1) answering formal competency questions that evaluate the ability of the deontology to fulfill its requirements, as set by the developer; 2) automated consistency checking that evaluates the consistency of the components of the model (concepts, relations, and axioms); 3) case studies that evaluate the applicability of the deontology to real project compliance checking problems (the case studies focused on environmental compliance checking, specifically on checking storm-water pollution prevention plans against applicable norms); and 4) domain expert interviews that evaluate the deontic model from a user’s perspective. The third research objective is to develop a semantic (deontic-based) text classification (TC) algorithm to classify the clauses and sub-clauses of contract general conditions as environmental or non-environmental (since the second objective focuses on environmental compliance checking). Text classification is the process of identifying the group to which a piece of text belongs. Different text classification methods, such as the naïve Bayes (NB) classifier, support vector machines (SVM), and maximum entropy (ME), were studied and empirically evaluated in the context of construction contract text classification. Different preprocessing and feature selection methods were implemented and evaluated in terms of recall and precision. The final classifier model uses the ‘bag of words’ feature model, stop-word removal with a standard English stop-word list, stemming, the odds ratio scoring function, the best 20 features, feature weighting by term frequency, and the SVM algorithm for machine learning and classification. The model achieves 100% recall and 96% precision at a 26% threshold.
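The feature-selection steps named above (bag of words, stop-word removal, odds ratio scoring, keeping the best k features, term-frequency weighting) can be sketched in plain Python. This is an illustration under assumptions, not the thesis's implementation: the tiny stop-word list, the smoothed log form of the odds ratio, the function names, and the four example clauses are all made up here, and stemming and SVM training are omitted.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (assumption; the thesis used a standard one).
STOP_WORDS = {"the", "a", "an", "of", "to", "and", "shall", "be", "in", "with"}


def tokenize(text):
    """Lowercase, keep alphabetic tokens, and drop stop words (no stemming)."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]


def odds_ratio_scores(docs, labels, smoothing=0.5):
    """Score each term by a smoothed log odds ratio.

    score(t) = log( P(t|pos) * (1 - P(t|neg)) / ((1 - P(t|pos)) * P(t|neg)) )
    where P(t|c) is the smoothed fraction of class-c documents containing t.
    """
    pos = [set(tokenize(d)) for d, y in zip(docs, labels) if y]
    neg = [set(tokenize(d)) for d, y in zip(docs, labels) if not y]
    vocab = set().union(*pos, *neg)
    scores = {}
    for term in vocab:
        p_pos = (sum(term in d for d in pos) + smoothing) / (len(pos) + 2 * smoothing)
        p_neg = (sum(term in d for d in neg) + smoothing) / (len(neg) + 2 * smoothing)
        scores[term] = math.log(p_pos * (1 - p_neg) / ((1 - p_pos) * p_neg))
    return scores


def top_k_features(scores, k):
    """Return the k highest-scoring terms, breaking ties alphabetically."""
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


def tf_vector(text, features):
    """Weight each selected feature by its raw term frequency in the text."""
    counts = Counter(tokenize(text))
    return [counts[f] for f in features]


# Made-up clauses: True marks an environmental clause.
docs = [
    "The contractor shall prevent storm water pollution and control erosion.",
    "Erosion and sediment control shall comply with storm water regulations.",
    "Payment shall be made within thirty days of the invoice.",
    "The owner may terminate the contract upon written notice.",
]
labels = [True, True, False, False]

scores = odds_ratio_scores(docs, labels)
features = top_k_features(scores, 4)
print(features)
print(tf_vector("Storm water and storm drains", features))
```

The resulting term-frequency vectors are what a classifier such as an SVM would be trained on; any SVM library could be substituted at that step.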

Bibliographic record

  • Author

    Abdelmoneim Dareen

  • Year 2011
  • Original format PDF
  • Language English
