Journal: Science of Computer Programming

Source-code queries with graph databases-with application to programming language usage and evolution

Abstract

Program querying and analysis tools are of growing importance, and occur in two main variants. Firstly, there are source-code query languages, which help software engineers to explore a system or to find code in need of refactoring as coding standards evolve. These also enable language designers to understand the practical uses of language features and idioms over a software corpus. Secondly, there are program analysis tools in the style of Coverity, which perform deeper program analysis, searching for bugs as well as checking adherence to coding standards such as MISRA. The former class are typically implemented on top of relational or deductive databases and make ad-hoc trade-offs between scalability and the amount of source-code detail held, with consequent limitations on the expressiveness of queries. The latter class are more commercially driven and involve more ad-hoc queries over program representations; nonetheless, similar pressures encourage user-visible domain-specific languages to specify analyses. We argue that a graph data model and associated query language provides a unifying conceptual model and gives an efficient, scalable implementation even when storing full source-code detail. It also supports overlays, allowing a query DSL to pose queries at a mixture of syntax-tree, type, control-flow-graph or data-flow levels. We describe a prototype source-code query system built on top of Neo4j using its Cypher graph query language; experiments show it scales to multi-million-line programs while also storing full source-code detail.