The Expansion of Source Code Abbreviations Using a Language Model

Abstract

Programmers often abbreviate identifier names in source code, using an abbreviation to stand for a single word (a unigram) or a phrase (a multigram). However, recovering the original word or words behind an abbreviation is difficult during maintenance, which makes the source code harder to comprehend, and an incorrect expansion may introduce defects into the code. Many approaches automatically expand abbreviations to their original words; unfortunately, they rely on predefined patterns and single-word dictionaries, which cannot handle abbreviations that expand to phrases. In this paper, we describe a bigram-based inference model that uses the statistical properties of unigrams as evidence to retrieve the original word automatically. We evaluated our approach on a set of 100 abbreviations randomly picked from eight open-source projects and found that it correctly expands 78% of the set.
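The abstract does not give the details of the model, so the following is only a minimal sketch of the general idea of ranking candidate expansions with bigram statistics, not the authors' method. It assumes a toy corpus, a subsequence-based candidate filter, and add-one smoothing; the names `CORPUS`, `is_subsequence`, `train`, and `expand` are hypothetical, and the sketch handles single-word expansions only.

```python
from collections import Counter

# Toy corpus standing in for words split out of identifiers and comments
# (hypothetical data, not taken from the paper).
CORPUS = [
    "get string buffer", "string buffer length", "read input stream",
    "string builder append", "input string value", "file input stream",
]

def is_subsequence(abbr, word):
    """True if abbr's letters occur in word in order and both share a first letter."""
    if not abbr or not word or abbr[0] != word[0]:
        return False
    remaining = iter(word)
    # `ch in remaining` consumes the iterator, so letter order is enforced.
    return all(ch in remaining for ch in abbr)

def train(corpus):
    """Count unigrams and adjacent-word bigrams over whitespace-split lines."""
    unigrams, bigrams = Counter(), Counter()
    for line in corpus:
        tokens = line.split()
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams

def expand(abbr, prev_word, unigrams, bigrams):
    """Pick the candidate word most likely to follow prev_word, or None."""
    vocab_size = len(unigrams)
    candidates = [w for w in unigrams if w != abbr and is_subsequence(abbr, w)]
    if not candidates:
        return None

    def score(word):
        # Add-one smoothed estimate of P(word | prev_word).
        return (bigrams[(prev_word, word)] + 1) / (unigrams[prev_word] + vocab_size)

    # Ties on the bigram score are broken by unigram frequency.
    return max(candidates, key=lambda w: (score(w), unigrams[w]))

unigrams, bigrams = train(CORPUS)
print(expand("str", "input", unigrams, bigrams))
print(expand("str", "get", unigrams, bigrams))
```

In this sketch the same abbreviation `str` resolves to a different word depending on the preceding word, which is the kind of contextual evidence a single-word dictionary lookup cannot use.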

Bibliographic record

Source
IEEE Annual Computer Software and Applications Conference | 2018 | pp. 370-375 | 6 pages
Authors
Abdulrahman Alatawi; Weifeng Xu; Jie Yan
Original format
PDF
Keywords
Mathematical model; Dictionaries; Software; Computer science; Bayes methods; Conferences
