Venue: IEEE/ACM International Conference on Mining Software Repositories

SeSaMe: A Data Set of Semantically Similar Java Methods

Abstract

In the past, techniques for detecting similarly behaving code fragments were often evaluated only with small, artificial oracles or with code originating from programming competitions. Such code fragments differ substantially from production code. To enable more realistic evaluations, this paper presents SeSaMe, a data set of method pairs that are classified according to their semantic similarity. We applied text similarity measures to JavaDoc comments mined from 11 open source repositories and manually classified a selection of 857 pairs.
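
The abstract does not say which text similarity measures were applied to the JavaDoc comments. As a rough illustration of the general idea only, the sketch below scores two comments with token-level Jaccard similarity. The measure, the helper names `tokenize` and `jaccard`, and the sample comments are all assumptions made for this example and are not taken from the paper.

```python
import re


def tokenize(comment: str) -> set:
    """Lowercase a comment and return its set of word tokens."""
    # Drop JavaDoc block tags such as @param or @return
    # (an illustrative cleanup step, not the paper's preprocessing).
    text = re.sub(r"@\w+", " ", comment)
    return set(re.findall(r"[a-z]+", text.lower()))


def jaccard(a: str, b: str) -> float:
    """Jaccard similarity of the token sets of two comments."""
    tokens_a, tokens_b = tokenize(a), tokenize(b)
    if not tokens_a and not tokens_b:
        return 0.0  # convention chosen here for two empty comments
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


# Hypothetical JavaDoc summaries, invented for this example.
doc_a = "Returns the maximum of two integers."
doc_b = "Returns the larger of two int values."
doc_c = "Closes the underlying stream."

print(jaccard(doc_a, doc_b))  # related comments share more tokens
print(jaccard(doc_a, doc_c))  # unrelated comments share fewer
```

A score like this can only propose candidate pairs. The abstract reports that the selected pairs were then classified manually, which fits with comment overlap being a proxy for semantic similarity rather than evidence of it.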
