Accurate search over open source code is a prerequisite for code reuse. Existing keyword-based search methods match only method signatures. Because source code comments describe a method's functionality semantically, this paper proposes a keyword search method that also takes code comments into account. An abstract syntax tree is generated from the source code, and code features, including method signatures and the various types of comments, are identified from it. The code features and the query are each represented as vectors, the cosine similarity between the vectors is computed, and a scoring mechanism assigns weights to the multiple features of each search result. Results are ranked by score, yielding an ordered list of methods relevant to the query. Experimental results show that combining multiple code features under different weights can improve the accuracy of source code search.
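The scoring step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the tokenization, the vector representation, or the weight values, so everything below is an assumption made for illustration: raw term-frequency vectors, a simple camelCase-aware tokenizer, and hypothetical weights of 0.6 for the signature and 0.4 for the comment. Abstract syntax tree extraction is omitted; each method is assumed to arrive as a dict with already-extracted `signature` and `comment` strings.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split camelCase identifiers and return lowercase alphabetic tokens."""
    spaced = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", text)
    return [t.lower() for t in re.findall(r"[A-Za-z]+", spaced)]


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def score(query, features, weights):
    """Weighted sum of per-feature cosine similarities (weights are assumed)."""
    q = Counter(tokenize(query))
    return sum(
        w * cosine(q, Counter(tokenize(features.get(name, ""))))
        for name, w in weights.items()
    )


def search(query, methods, weights):
    """Return method signatures ordered by descending score."""
    ranked = sorted(methods, key=lambda m: score(query, m, weights), reverse=True)
    return [m["signature"] for m in ranked]


# Hypothetical corpus: features assumed to be already extracted from an AST.
methods = [
    {"signature": "void readFile(String path)",
     "comment": "Reads the content of a file"},
    {"signature": "int process(String s)",
     "comment": "Parse a JSON string into an object"},
]
weights = {"signature": 0.6, "comment": 0.4}  # illustrative values only
print(search("parse json string", methods, weights))
```

In this toy corpus the second method's signature barely matches the query, and it is its comment that separates it clearly from the first method, which is the kind of case the comment feature is meant to cover. Changing the entries in `weights` shifts how much each feature contributes to the ranking.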