Conference: IEEE/ACM International Conference on Automated Software Engineering

Patching as Translation: the Data and the Metaphor

Abstract

Machine learning models from other fields, like Computational Linguistics, have been transplanted to Software Engineering tasks, often quite successfully. Yet a transplanted model's initial success at a given task does not necessarily mean it is well-suited for the task. In this work, we examine a common example of this phenomenon: the conceit that software patching is like language translation. We demonstrate empirically that there are subtle but critical distinctions between sequence-to-sequence models and translation models: while program repair benefits greatly from the former, general modeling architecture, it actually suffers from design decisions built into the latter, both in terms of translation accuracy and diversity. Given these findings, we demonstrate how a more principled approach to model design, based on our empirical findings and general knowledge of software development, can lead to better solutions. Our findings also lend strong support to the recent trend towards synthesizing edits of code, conditional on the buggy context, to repair bugs. We implement such models ourselves as “proof-of-concept” tools and empirically confirm that they behave in a fundamentally different, more effective way than the studied translation-based architectures. Overall, our results demonstrate the merit of studying the intricacies of machine-learned models in software engineering: not only can this help elucidate potential issues that may be overshadowed by increases in accuracy; it can also help innovate on these models to raise the state of the art further. We will publicly release our replication data and materials at https://github.com/ARiSE-Lab/Patch-as-translation.
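
To make the contrast in the abstract concrete, the sketch below shows the two output representations a repair model could be asked to produce: the full fixed token sequence (the translation-style target) versus a short list of edits conditioned on the buggy tokens. This is only an illustration and not the authors' models. The helper names `token_edits` and `apply_edits`, the whitespace tokenization, and the example snippet are all assumptions, and the edits are derived with Python's standard `difflib` rather than learned.

```python
from difflib import SequenceMatcher


def token_edits(buggy, fixed):
    """Return the non-equal edit operations that turn `buggy` into `fixed`.

    Each edit is (tag, start, end, new_tokens), where [start, end) indexes
    into the buggy token list. Hypothetical helper for illustration only.
    """
    matcher = SequenceMatcher(a=buggy, b=fixed, autojunk=False)
    edits = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            edits.append((tag, i1, i2, fixed[j1:j2]))
    return edits


def apply_edits(buggy, edits):
    """Apply edits to a copy of `buggy`, right to left so indices stay valid."""
    out = list(buggy)
    for _tag, i1, i2, new_tokens in sorted(edits, key=lambda e: e[1], reverse=True):
        out[i1:i2] = new_tokens
    return out


# Assumed toy example: an off-by-one comparison, tokenized on whitespace.
buggy = "if ( i <= len ( xs ) ) :".split()
fixed = "if ( i < len ( xs ) ) :".split()

edits = token_edits(buggy, fixed)
print("translation-style target:", len(fixed), "tokens")
print("edit-style target:", edits)
print("round trip ok:", apply_edits(buggy, edits) == fixed)
```

In this toy case the translation-style target repeats ten tokens, nine of them copied unchanged from the input, while the edit-style target is a single replacement at one position.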