Empirical Principles and an Industrial Case Study in Retrieving Equivalent Requirements via Natural Language Processing Techniques

Falessi Davide; Cantone Giovanni; Canfora Gerardo

Abstract

Though very important in software engineering, linking artifacts of the same type (clone detection) or different types (traceability recovery) is extremely tedious, error-prone, and effort-intensive. Past research focused on supporting analysts with techniques based on Natural Language Processing (NLP) to identify candidate links. Because many NLP techniques exist and their performance varies according to context, it is crucial to define and use reliable evaluation procedures. The aim of this paper is to propose a set of seven principles for evaluating the performance of NLP techniques in identifying equivalent requirements. In this paper, we conjecture, and verify, that NLP techniques perform on a given dataset according to both ability and the odds of identifying equivalent requirements correctly. For instance, when the odds of identifying equivalent requirements are very high, then it is reasonable to expect that NLP techniques will result in good performance. Our key idea is to measure this random factor of the specific dataset(s) in use and then adjust the observed performance accordingly. To support the application of the principles we report their practical application to a case study that evaluates the performance of a large number of NLP techniques for identifying equivalent requirements in the context of an Italian company in the defense and aerospace domain. The current application context is the evaluation of NLP techniques to identify equivalent requirements. However, most of the proposed principles seem applicable to evaluating any estimation technique aimed at supporting a binary decision (e.g., equivalent/nonequivalent), with the estimate in the range [0,1] (e.g., the similarity provided by the NLP), when the dataset(s) is used as a benchmark (i.e., testbed), independently of the type of estimator (i.e., requirements text) and of the estimation method (e.g., NLP).
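The abstract's key idea, measuring the dataset's random factor and reading observed performance against it, can be illustrated with a minimal Python sketch. This is an illustration under stated assumptions, not the paper's seven principles or its actual adjustment procedure: the function names, the toy data, and the choice of "precision divided by the prevalence of equivalent pairs" as the adjustment are all hypothetical. It assumes a similarity score in [0,1] per requirement pair and a threshold that turns the score into an equivalent/nonequivalent decision.

```python
from typing import List, Tuple


def precision_recall(scores: List[float], labels: List[bool],
                     threshold: float) -> Tuple[float, float]:
    """Precision and recall when pairs scoring >= threshold are flagged equivalent."""
    flagged = [s >= threshold for s in scores]
    tp = sum(1 for f, y in zip(flagged, labels) if f and y)
    fp = sum(1 for f, y in zip(flagged, labels) if f and not y)
    fn = sum(1 for f, y in zip(flagged, labels) if not f and y)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def chance_precision(labels: List[bool]) -> float:
    """Expected precision of random guessing: the share of equivalent pairs.

    Hypothetical stand-in for the dataset's 'odds'; the paper may measure it differently.
    """
    return sum(labels) / len(labels)


def lift_over_chance(scores: List[float], labels: List[bool],
                     threshold: float) -> float:
    """Observed precision relative to the chance baseline (assumed adjustment)."""
    precision, _ = precision_recall(scores, labels, threshold)
    baseline = chance_precision(labels)
    return precision / baseline if baseline else 0.0


# Toy data (made up): similarity per requirement pair and the true label.
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.1]
labels = [True, False, True, False, False, False]

precision, recall = precision_recall(scores, labels, threshold=0.5)
print(f"precision={precision:.2f} recall={recall:.2f}")
print(f"chance precision={chance_precision(labels):.2f}")
print(f"lift over chance={lift_over_chance(scores, labels, 0.5):.2f}")
```

On a dataset where most pairs are equivalent, the chance baseline is high and the same raw precision yields a lift close to 1, which is the sense in which raw scores alone can overstate a technique's ability.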

Bibliographic record

Source
Software Engineering, IEEE Transactions on | 2013, No. 1 | pp. 18-44 | 27 pages
Authors
Falessi Davide; Cantone Giovanni; Canfora Gerardo
Author affiliations
Simula Research Laboratory, Lysaker and University of Rome "Tor Vergata", Rome
Indexed in: Science Citation Index (SCI), USA; Engineering Index (EI), USA
Original format: PDF
Language: English
Keywords
Context; Matrix decomposition; Measurement; Monitoring; Natural language processing; Semantics; Thesauri; Empirical software engineering; equivalent requirements; metrics and measurement; natural language processing; traceability recovery;

Similar literature

1. Retrieving similar cases for construction project risk management using Natural Language Processing techniques [J]. Zou Yang, Kiviniemi Arto, Jones Stephen W. Automation in construction. 2017, No. Auga
2. Case studies on using natural language processing techniques in customer relationship management software [J]. Ozan Sukru. Journal of Intelligent Information Systems. 2021, No. 2
3. Novel application of natural language processing and machine learning techniques to analyze qualitative patient-reported outcomes data: a report from the PEPR pediatric cancer survivorship study [J]. Lu Zhaohua, Baker Justin, Krull Kevin, Quality of life research: An international journal of quality of life aspects of treatment, care and rehabilitation. 2019, No. Suppla1
4. Multi-Class Categorization of Design-Build Contract Requirements Using Text Mining and Natural Language Processing Techniques [C]. Fahad ul Hassan, Tuyen Le, Duc-Hoc Tran. Construction Research Congress. 2020
5. Essays in empirical industrial organization using time series techniques: Applications in natural resource markets. [D]. Fell, Harrison G. 2007
6. Applying natural language processing and machine learning techniques to patient experience feedback: a systematic review [O]. Mustafa Khanbhai, Patrick Anyadi, Joshua Symons, 2021
7. Prediction of emergency department resource requirements during triage: An application of current natural language processing techniques [O]. Nicholas W. Sterling, Felix Brann, Rachel E. Patzer, 2020
