Conference: IEEE International Conference on Software Engineering and Service Science

Scalable Source Code Plagiarism Detection Using Source Code Vectors Clustering

Abstract

Plagiarism is a growing problem because so many resources are easily accessible online. New algorithms are constantly being developed, but few existing systems can detect plagiarism successfully in large databases of source files. Our work aims to deal with plagiarism on a large scale. This paper describes our new scalable approach to detecting source code plagiarism in the academic environment. The algorithm is designed to search for plagiarism across a huge number of source code files. An incremental clustering approach is applied to make the algorithm modular and scalable. The paper also details the data persistence structures and the methods used to search for matching source code snippets. In addition, we present results of this approach on real student submissions and compare them with those of other detection systems.
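
The abstract does not spell out how the source code vectors are built or clustered, so the following is only a minimal sketch of what incremental clustering of source code vectors could look like, not the authors' method. Everything in it is an assumption made for illustration: the token n-gram features, the identifier normalization, the cosine similarity measure, the 0.8 threshold, and the names `vectorize`, `cosine`, and `IncrementalClusters`. The point it illustrates is that each new submission is compared against the existing cluster centroids rather than against every stored file, which is what lets the collection grow incrementally.

```python
import math
import re
from collections import Counter

# Hypothetical keyword list; a real system would use the target language's grammar.
KEYWORDS = {"def", "return", "for", "in", "if", "else", "while"}


def vectorize(source, n=3):
    """Map source text to a bag of token n-grams (assumed feature choice)."""
    tokens = re.findall(r"[A-Za-z_]\w*|\d+|\S", source)
    norm = []
    for tok in tokens:
        if tok in KEYWORDS:
            norm.append(tok)
        elif re.match(r"[A-Za-z_]", tok):
            norm.append("ID")   # renamed identifiers collapse to one symbol
        elif tok[0].isdigit():
            norm.append("NUM")  # numeric literals collapse to one symbol
        else:
            norm.append(tok)
    return Counter(tuple(norm[i:i + n]) for i in range(len(norm) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(v * b[k] for k, v in a.items() if k in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


class IncrementalClusters:
    """Assign each new file to the most similar cluster, or open a new one."""

    def __init__(self, threshold=0.8):
        self.threshold = threshold  # assumed cut-off, not taken from the paper
        self.centroids = []         # summed member vectors, one per cluster
        self.members = []           # file names, one list per cluster

    def add(self, name, source):
        vec = vectorize(source)
        best, best_sim = None, 0.0
        for i, centroid in enumerate(self.centroids):
            sim = cosine(vec, centroid)
            if sim > best_sim:
                best, best_sim = i, sim
        if best is not None and best_sim >= self.threshold:
            # Cosine ignores scale, so a running sum serves as the centroid.
            self.centroids[best].update(vec)
            self.members[best].append(name)
            return best
        self.centroids.append(Counter(vec))
        self.members.append([name])
        return len(self.centroids) - 1
```

In this sketch, files that land in the same cluster would be the candidates passed on to a finer snippet-level comparison; how the paper performs that step is not described in the abstract.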
