Conference: International Conference on Software Maintenance and Evolution

Query by example in large-scale code repositories



Abstract

Searching for code samples in a code repository is an important part of program comprehension. Most existing code search tools support syntactic element search and regular expression pattern search. However, they are text-based and hence cannot handle queries that are syntactic patterns. Proposed solutions that query syntactic patterns through specialized query languages present a steep learning curve for users. Querying would be more user-friendly if the syntactic pattern could be formulated in the underlying programming language (as a sample code snippet) instead of a specialized query language. In this paper, we propose a solution to the query-by-example problem using Abstract Syntax Tree (AST) structural similarity matching. The query snippet is converted to an AST, its subtrees are compared against the AST subtrees of source files in the repository, and the similarity values of matching subtrees are aggregated to arrive at a relevance score for each source file. To scale this approach to large code repositories, we use locality-sensitive hash functions and numerical vector approximation of trees. Our experimental evaluation involves running control queries against a real project. The results show that our algorithm can achieve high precision (0.73) and recall (0.81) and scale to large code repositories without compromising quality.
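The pipeline described in the abstract (parse the query, compare subtrees, aggregate similarities per file) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not specify the target language, the vector definition, the similarity measure, or the aggregation rule, so the following are all assumptions made for illustration: Python's `ast` module as the parser, node-type count vectors as the numerical approximation of a tree, cosine similarity, summation of best matches, and the names `subtree_vectors`, `relevance`, `rank`, `MIN_NODES`, and `threshold`.

```python
import ast
import math
from collections import Counter

MIN_NODES = 3  # assumed cutoff: subtrees smaller than this are ignored


def subtree_vectors(source):
    """Return a node-type count vector for every AST subtree of `source`
    that has at least MIN_NODES nodes (an assumed tree approximation)."""
    vectors = []
    for node in ast.walk(ast.parse(source)):
        counts = Counter(type(child).__name__ for child in ast.walk(node))
        if sum(counts.values()) >= MIN_NODES:
            vectors.append(counts)
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def relevance(query, source, threshold=0.8):
    """Sum, over query subtrees, the best similarity found in the file,
    counting only matches at or above `threshold` (assumed aggregation)."""
    file_vectors = subtree_vectors(source)
    score = 0.0
    for q in subtree_vectors(query):
        best = max((cosine(q, f) for f in file_vectors), default=0.0)
        if best >= threshold:
            score += best
    return score


def rank(query, repository):
    """Order file names in `repository` (name -> source) by relevance."""
    scores = {name: relevance(query, src) for name, src in repository.items()}
    return sorted(scores, key=scores.get, reverse=True)


if __name__ == "__main__":
    repo = {
        "a.py": "def f(items):\n    s = 0\n    for i in items:\n        s += i\n    return s\n",
        "b.py": "import os\nprint(os.getcwd())\n",
    }
    print(rank("for x in xs:\n    total += x\n", repo))
```

This sketch compares every query subtree against every file subtree, which does not scale. The abstract's locality-sensitive hashing step would presumably replace that all-pairs loop by hashing the vectors into buckets and comparing only within a bucket; that step is not shown here.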
