Conference: International Conference on Database Systems for Advanced Applications
Code2Text: Dual Attention Syntax Annotation Networks for Structure-Aware Code Translation

Abstract

Translating source code into natural language text helps people understand computer programs better and faster. Previous code translation methods mainly exploit human-specified syntax rules. Since handcrafted syntax rules are expensive to obtain and not always available, a programming-language-independent (PL-independent) automatic code translation method is highly desirable. However, existing sequence translation methods generally treat the source text as a plain sequence, which cannot capture the rich hierarchical characteristics that inherently reside in code. In this work, we exploit the abstract syntax tree (AST), which summarizes the hierarchical information of a code snippet, to build a structure-aware code translation method. We propose a syntax annotation network called Code2Text that incorporates both the source code and its AST into the translation. Code2Text features dual encoders for the sequential input (code) and the structural input (AST), respectively. We also propose a novel dual-attention mechanism that guides the decoding process by accurately aligning the output words with both the tokens in the source code and the nodes in the AST. Experiments on a public collection of Python code demonstrate that Code2Text outperforms several state-of-the-art methods and that the text it generates is accurate and human-readable.
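
To make the two input views and the dual-attention idea concrete, the following is a minimal sketch and not the authors' implementation. It uses Python's standard `tokenize` and `ast` modules to obtain the sequential input (tokens) and the structural input (AST node types) for one snippet, then computes one attention distribution over each. The abstract does not specify the encoders, the scoring function, or how the two contexts are merged, so `toy_embed`, the dot-product scoring, the concatenation step, and the `query` vector are all assumptions made for illustration.

```python
import ast
import io
import math
import tokenize


def code_tokens(source):
    """Sequential view: lexical tokens of the snippet, in source order."""
    skip = {tokenize.NEWLINE, tokenize.NL, tokenize.INDENT,
            tokenize.DEDENT, tokenize.ENDMARKER, tokenize.COMMENT}
    readline = io.StringIO(source).readline
    return [tok.string for tok in tokenize.generate_tokens(readline)
            if tok.type not in skip]


def ast_nodes(source):
    """Structural view: AST node type names (ast.walk order is unspecified)."""
    return [type(node).__name__ for node in ast.walk(ast.parse(source))]


def toy_embed(symbol):
    """Hypothetical stand-in for a learned encoder state (2 dimensions)."""
    return [(sum(map(ord, symbol)) % 7) / 7.0, len(symbol) / 10.0]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, states):
    """Dot-product attention: weights over states and their weighted sum."""
    scores = [sum(q * h for q, h in zip(query, state)) for state in states]
    weights = softmax(scores)
    dim = len(states[0])
    context = [sum(w * state[i] for w, state in zip(weights, states))
               for i in range(dim)]
    return weights, context


def dual_attention(query, token_states, node_states):
    """Attend over each encoder separately, then concatenate the contexts.

    Concatenation is an assumption; the abstract only says that output
    words are aligned with both code tokens and AST nodes.
    """
    tok_w, tok_ctx = attend(query, token_states)
    node_w, node_ctx = attend(query, node_states)
    return tok_w, node_w, tok_ctx + node_ctx


snippet = "def add(a, b):\n    return a + b\n"
tokens = code_tokens(snippet)
nodes = ast_nodes(snippet)
query = [1.0, 0.5]  # hypothetical decoder state at one output step
tok_w, node_w, context = dual_attention(
    query, [toy_embed(t) for t in tokens], [toy_embed(n) for n in nodes])

print(tokens)
print(nodes)
print(context)
```

In a trained model the toy embeddings would be replaced by the hidden states of the two encoders, and the combined context vector would feed the decoder at each output step.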
