Deep Code Search

Abstract

To implement a piece of program functionality, developers can reuse previously written code snippets by searching a large-scale codebase. Over the years, many code search tools have been proposed to help them. Existing approaches often treat source code as textual documents and use information retrieval models to retrieve code snippets that match a given query. These approaches rely mainly on the textual similarity between source code and the natural language query, and they lack a deep understanding of the semantics of either. In this paper, we propose a novel deep neural network named CODEnn (Code-Description Embedding Neural Network). Instead of matching on textual similarity, CODEnn jointly embeds code snippets and natural language descriptions into a high-dimensional vector space, in such a way that a code snippet and its corresponding description have similar vectors. Using this unified vector representation, code snippets related to a natural language query can be retrieved according to their vectors. Semantically related words can also be recognized, and irrelevant or noisy keywords in queries can be handled. As a proof-of-concept application, we implement a code search tool named DeepCS using the proposed CODEnn model. We empirically evaluate DeepCS on a large-scale codebase collected from GitHub. The experimental results show that our approach can effectively retrieve relevant code snippets and outperforms previous techniques.
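
The retrieval step described in the abstract can be illustrated with a minimal sketch. It assumes that code snippets and the query have already been embedded into the same vector space, and it ranks snippets by cosine similarity, which is one common choice of vector similarity and not necessarily the measure used by DeepCS. The snippet names, the three-dimensional vectors, and the `search` helper are all hypothetical; in CODEnn the vectors would be produced by the trained network and would have far more dimensions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_u * norm_v)


def search(query_vec, code_index, top_k=2):
    """Return the names of the top_k snippets closest to the query vector."""
    scored = [(cosine(query_vec, vec), name) for name, vec in code_index.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored[:top_k]]


# Hypothetical, hand-picked embeddings standing in for learned code vectors.
code_index = {
    "read_file_lines": [0.9, 0.1, 0.0],
    "parse_json_string": [0.1, 0.9, 0.1],
    "open_http_connection": [0.0, 0.2, 0.9],
}

# Hypothetical embedding of a query such as "read a text file line by line".
query_vec = [0.8, 0.2, 0.1]

results = search(query_vec, code_index)
print(results)
```

Because matching happens on vectors and not on shared keywords, a query can rank a snippet highly even when the two have no words in common, provided the learned embeddings place them close together.
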