Building Statistical Language Models of code

Abstract

We present the Source Code Statistical Language Model data analysis pattern. Statistical language models have been an enabling tool for a wide array of important language technologies. Speech recognition, machine translation, and document summarization (to name a few) all rely on statistical language models to assign probability estimates to natural language utterances or sentences. In this data analysis pattern, we describe the process of building n-gram language models over software source files. We hope that by introducing the empirical software engineering community to best practices that have been established over the years in research for natural languages, statistical language models can become a tool that SE researchers are able to use to explore new research directions.
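
The abstract names the technique (n-gram language models over source files) without giving the procedure, so the following is only a minimal sketch of the general idea, not the authors' pipeline. The regex tokenizer, the choice of bigrams, add-one smoothing, and all function names are assumptions made for illustration; practical toolkits typically use stronger smoothing and a real lexer for the target language.

```python
import math
import re
from collections import Counter

# Assumed tokenizer: identifiers, integer literals, or any single symbol.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|\S")

def tokenize(source):
    """Split a source string into a flat list of lexical tokens."""
    return TOKEN_RE.findall(source)

def train_bigram(sources):
    """Count contexts and bigrams over a list of source strings."""
    contexts, bigrams, vocab = Counter(), Counter(), set()
    for src in sources:
        tokens = ["<s>"] + tokenize(src) + ["</s>"]
        vocab.update(tokens)
        contexts.update(tokens[:-1])             # every token that has a successor
        bigrams.update(zip(tokens, tokens[1:]))  # adjacent token pairs
    return contexts, bigrams, vocab

def prob(model, prev, tok):
    """P(tok | prev) with add-one (Laplace) smoothing."""
    contexts, bigrams, vocab = model
    return (bigrams[(prev, tok)] + 1) / (contexts[prev] + len(vocab))

def cross_entropy(model, source):
    """Average negative log2 probability per token of a source string."""
    tokens = ["<s>"] + tokenize(source) + ["</s>"]
    pairs = list(zip(tokens, tokens[1:]))
    return -sum(math.log2(prob(model, p, t)) for p, t in pairs) / len(pairs)

# Toy corpus, invented for this example.
model = train_bigram(["x = x + 1", "y = y + 1"])
seen = cross_entropy(model, "x = x + 1")
shuffled = cross_entropy(model, "1 + = x x")
```

On this toy corpus the shuffled token order scores a higher cross-entropy than the line seen in training, which is the kind of signal such a model provides: token sequences that look like the training code receive higher probability.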

Bibliographic details

Source: 2013 1st International Workshop on Data Analysis Patterns in Software Engineering, 2013, pp. 1-3 (3 pages)
Conference location: San Francisco, CA (US)
Authors: Schulam, Peter; Rosenfeld, Roni; Devanbu, Premkumar
Author affiliation: Language Technologies Institute, Carnegie Mellon University, USA
Original format: PDF
Language: English
