International Conference on Data and Software Engineering

Java archives search engine using byte code as information source




Information about a computer program can be extracted from its source code, external documentation, and compiled code. Although compiled code is a guaranteed information source that always exists in published computer programs, existing search engines seldom use it because doing so requires reverse engineering. In this research, a search engine for Java archives is developed that uses byte code (the compiled code in a Java archive) as its information source. It enables users to search a collection of Java archives without relying on source code or external documentation. Compared with Penta and FindJar [2] [7], a novel term extraction process is proposed that goes beyond file and class names to include field names, method names, string literals used in the program, program flow weighting, and method expansion. Specialized tokenization, stopping, and stemming are also implemented to improve effectiveness. Based on evaluation, the search engine is fairly effective, although its effectiveness varies with the terms stored in the index. It is more effective than a reimplementation of FindJar's main features, which indicates that detailed compiled-code information has a positive influence on search engines for computer programs. Efficiency depends on how many terms are stored in the index and how many processes are used at a given step.
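To make the tokenization and stopping steps concrete, the following is a minimal sketch of how an identifier taken from byte code (a class, field, or method name) might be turned into index terms. It is not the authors' implementation: the class name `IdentifierTokenizer`, the splitting rules, and the stop list are assumptions chosen for illustration, and stemming, program flow weighting, and method expansion are left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class IdentifierTokenizer {
    // Hypothetical stop list; the abstract does not say which words the paper removes.
    private static final Set<String> STOP_WORDS =
            Set.of("get", "set", "is", "the", "a", "of", "to");

    /** Splits a Java identifier into lowercase terms and drops stop words. */
    public static List<String> tokenize(String identifier) {
        // Insert a space at lower-to-upper boundaries ("fooBar" -> "foo Bar"),
        // then between an acronym and a following capitalized word
        // ("HTTPResponse" -> "HTTP Response").
        String spaced = identifier
                .replaceAll("([a-z0-9])([A-Z])", "$1 $2")
                .replaceAll("([A-Z]+)([A-Z][a-z])", "$1 $2");

        List<String> terms = new ArrayList<>();
        // Underscores, dollar signs, dots, and spaces all act as separators.
        for (String part : spaced.split("[^A-Za-z0-9]+")) {
            String term = part.toLowerCase(Locale.ROOT);
            if (!term.isEmpty() && !STOP_WORDS.contains(term)) {
                terms.add(term);
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("getHTTPResponseCode")); // [http, response, code]
        System.out.println(tokenize("MAX_BUFFER_SIZE"));     // [max, buffer, size]
    }
}
```

Splitting on case boundaries matters here because names recovered from byte code follow Java naming conventions rather than natural-language spacing, so a query such as "response code" would otherwise never match `getHTTPResponseCode`.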




