Journal: Automated Software Engineering

Concept extraction from business documents for software engineering projects



Abstract

Acquiring relevant business concepts is a crucial first step for any software project in which the software experts are not domain experts. The wealth of information buried in an organization's written documentation is a valuable source of concepts, relationships and attributes that can be used to model the enterprise's domain. Without targeted extraction tools, perusing this type of resource can be a lengthy and costly process. We propose a domain-model-focused extraction process aimed at the rapid discovery of knowledge relevant to the software expert. To avoid undesirable noise from high-level linguistic tools, the process is mainly composed of positive and negative base filters, which are less error-prone and more robust. The extracted candidates are then reordered using a weight propagation algorithm based on structural hints from the source documents. When tested on French text corpora from public organizations, our process performs 2.7 times better than a statistical baseline for relevant concept discovery. A new metric for assessing the discovery speed of relevant concepts is introduced. The annotation of a gold standard definition of software-engineering-oriented concepts for knowledge extraction tasks is also presented.
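
The abstract names two stages: filter-based candidate extraction, then reordering driven by structural hints. The Python sketch below is only a rough illustration of that general shape, not the authors' method. Every detail is an assumption made for the example: the stopword list, the minimum word length used as a positive filter, the use of section headings as the only structural hint, and the `heading_boost` parameter. The simple heading boost is a stand-in for the paper's weight propagation algorithm, which the abstract does not specify.

```python
import re
from collections import Counter

# Hypothetical negative filter: function words that disqualify a candidate.
STOPWORDS = {"the", "a", "an", "of", "for", "and", "to", "in",
             "is", "are", "be", "by", "each"}


def extract_candidates(text, max_len=2):
    """Count word n-grams (up to max_len words) that pass both filters."""
    words = re.findall(r"[a-zA-Z]+", text.lower())
    counts = Counter()
    for n in range(1, max_len + 1):
        for i in range(len(words) - n + 1):
            gram = words[i:i + n]
            # Positive filter (assumed): every word has at least 3 letters.
            if any(len(w) < 3 for w in gram):
                continue
            # Negative filter (assumed): no stopword anywhere in the n-gram.
            if any(w in STOPWORDS for w in gram):
                continue
            counts[" ".join(gram)] += 1
    return counts


def rank(sections, heading_boost=2.0):
    """Rank candidates from (heading, body) pairs, best first.

    Base score is body frequency. A term that also appears in the
    section heading receives extra weight (the assumed structural hint).
    """
    scores = Counter()
    for heading, body in sections:
        body_counts = extract_candidates(body)
        for term, count in body_counts.items():
            scores[term] += count
        for term in extract_candidates(heading):
            scores[term] += heading_boost * (1 + body_counts[term])
    return [term for term, _ in scores.most_common()]


sections = [
    ("Purchase order",
     "Each purchase order is approved by a manager. "
     "The manager signs the purchase order."),
    ("Invoice", "An invoice references a purchase order."),
]
for term in rank(sections)[:5]:
    print(term)
```

With `heading_boost=0` the ranking falls back to plain frequency, which gives a crude way to see how much the structural hint changes the order for a given corpus.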
