Do Contexts Help in Phrase-Based, Statistical Source Code Migration?

Abstract

Prior research showed that migrating Java code to C# by directly applying phrase-based statistical machine translation (SMT) to the lexemes of source code produces a large amount of semantically incorrect code. In this work, we conduct empirical studies on several open-source projects to investigate how the well-defined semantics of programming languages can guide the translation process in SMT. We investigate five types of features that form contexts from the (semantic) relations among code tokens: occurrence association among code tokens, data dependencies and control dependencies among program entities, visibility constraints of entities, and consistency in the declarations and accesses of variables, fields, and methods. We use the Direct Maximum Entropy (DME) approach for feature integration. Our empirical results show that, when features are added individually to the baseline SMT model, token association and data dependencies contribute the most, with the highest relative improvements in semantic correctness of up to 18.3% and 18.5%, respectively. Integrating three feature types (token association, data dependencies, and visibility) into the baseline model yields the highest relative improvement in semantic correctness, up to 26.4%. Overall, 43.5-80.7% of the translated methods are semantically correct. Our results point to a promising direction: using SMT with semantic features at different levels of abstraction to improve its accuracy.
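The abstract names Direct Maximum Entropy as the way features are integrated but gives no formulas. In the SMT literature this usually means a log-linear model, in which each candidate translation is scored by a weighted sum of feature values and the highest-scoring candidate wins. The sketch below illustrates that general idea only; it is not the authors' implementation. The candidate names, feature names, feature values, and weights are all invented for illustration.

```python
import math

# Hypothetical feature values for two candidate C# translations of one Java
# fragment. Every name and number here is an assumption made for illustration.
CANDIDATES = {
    "candidate_a": {"baseline_logprob": -2.0, "token_association": 0.2, "data_dependency": 0.0},
    "candidate_b": {"baseline_logprob": -2.3, "token_association": 0.9, "data_dependency": 1.0},
}

# Hypothetical feature weights; in a real system these would be tuned on held-out data.
WEIGHTS = {"baseline_logprob": 1.0, "token_association": 0.5, "data_dependency": 0.5}
BASELINE_ONLY = {"baseline_logprob": 1.0, "token_association": 0.0, "data_dependency": 0.0}


def score(features, weights):
    """Log-linear score: weighted sum of the candidate's feature values."""
    return sum(weights[name] * value for name, value in features.items())


def posterior(candidates, weights):
    """Normalize exponentiated scores into a probability distribution over candidates."""
    scores = {name: score(feats, weights) for name, feats in candidates.items()}
    top = max(scores.values())  # subtract the max before exp() for numerical stability
    exps = {name: math.exp(s - top) for name, s in scores.items()}
    total = sum(exps.values())
    return {name: value / total for name, value in exps.items()}


def best(candidates, weights):
    """Pick the candidate with the highest log-linear score."""
    return max(candidates, key=lambda name: score(candidates[name], weights))


if __name__ == "__main__":
    print(best(CANDIDATES, BASELINE_ONLY))
    print(best(CANDIDATES, WEIGHTS))
    print(posterior(CANDIDATES, WEIGHTS))
```

With these made-up numbers, the baseline score alone prefers one candidate, while adding the two context features shifts the choice to the other. That shift is the kind of effect the study measures when it adds semantic features to the baseline model.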

Bibliographic details

Source: IEEE International Conference on Software Maintenance and Evolution, 2016, 1 v., 11 pages
Authors: Anh Tuan Nguyen; Zhaopeng Tu; Tien N. Nguyen
Original format: PDF
Chinese Library Classification: Computing technology, computer technology
Keywords: Semantics; Syntactics; Java; C; languages; Data models; Context

