Conference: International Conference on Software Engineering

Normalizing source code vocabulary to support program comprehension and software quality

Abstract

The literature reports that the source code lexicon plays a paramount role in program comprehension, especially when software documentation is scarce, outdated, or simply not available. In source code, a significant proportion of the vocabulary consists of acronyms, abbreviations, or concatenations of terms that cannot be identified using consistent mechanisms such as naming conventions. It is therefore essential to disambiguate the concepts conveyed by identifiers, both to support program comprehension and to reap the full benefit of Information Retrieval-based techniques (e.g., feature location and traceability), which require the linguistic information (i.e., source code identifiers and comments) used across all software artifacts (e.g., requirements, design, change requests, tests, and source code) to be consistent. To this end, we propose source code vocabulary normalization approaches that exploit contextual information to align the vocabulary found in the source code with that found in other software artifacts. Our choice of context levels was inspired by prior work and by our own findings. Normalization consists of two tasks: splitting and expansion of source code identifiers. We also investigate the effect of source code vocabulary normalization approaches on software maintenance tasks. The results of our evaluation show that our context-aware techniques are more accurate, and more efficient in terms of computation time, than state-of-the-art alternatives. In addition, our findings reveal that feature location techniques can benefit from vocabulary normalization when no dynamic information is available.
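To make the two normalization tasks concrete, the following is a minimal sketch of splitting and expansion, not the approach proposed in the paper. It assumes a simple regex-based splitter and a small hand-written abbreviation table (`ABBREVIATIONS`, hypothetical), and it approximates "context" as the words of a nearby comment. The function names `split_identifier`, `expand_term`, and `normalize` are likewise illustrative.

```python
import re

# Hypothetical abbreviation table, for illustration only. A real approach
# would draw expansion candidates from contextual sources such as comments,
# documentation, or other software artifacts.
ABBREVIATIONS = {
    "ptr": ["pointer"],
    "cnt": ["count", "counter"],
    "usr": ["user"],
    "msg": ["message"],
}


def split_identifier(identifier):
    """Split an identifier on underscores, camelCase boundaries, and digits."""
    parts = []
    for chunk in re.split(r"[_\W]+", identifier):
        # Acronym runs (HTML), capitalized or lowercase words, digit runs.
        parts.extend(re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", chunk))
    return [part.lower() for part in parts]


def expand_term(term, context_words):
    """Expand a term, preferring a candidate that also occurs in the context."""
    candidates = ABBREVIATIONS.get(term, [term])
    for candidate in candidates:
        if candidate in context_words:
            return candidate
    # No contextual evidence: fall back to the first listed candidate.
    return candidates[0]


def normalize(identifier, context=""):
    """Split an identifier, then expand each resulting term."""
    context_words = set(re.findall(r"[a-z]+", context.lower()))
    return [expand_term(term, context_words)
            for term in split_identifier(identifier)]


if __name__ == "__main__":
    comment = "increments the message counter for a user"
    print(normalize("getUsrMsgCnt", comment))
    print(normalize("usr_cnt"))
```

With the comment as context, `cnt` resolves to `counter` because that word appears in the comment. Without context, the sketch falls back to the first candidate, `count`, which shows why contextual information matters when an abbreviation has several plausible expansions.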