Scalable Source Code Plagiarism Detection Using Source Code Vectors Clustering

Abstract

Plagiarism is a growing problem because so many resources are easily accessible online. New algorithms are constantly being developed, but few existing systems can detect plagiarism successfully in large databases of source files. The aim of our work is to deal with plagiarism on a large scale. This paper describes our new scalable approach to detecting plagiarism in source code in the academic environment. The algorithm is designed to search for plagiarism across a huge number of source code files. An incremental clustering approach is applied to achieve modularity and scalability. The paper also details the data persistence structures and the methods used to search for matching source code snippets. In addition, we present some results of this approach on real student submissions and compare them with the results of other detection systems.
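
The abstract names incremental clustering of source code vectors but gives no algorithmic detail. The following is only a minimal illustrative sketch of that general idea, not the authors' method: the whitespace-token vectorization, the cosine similarity measure, the 0.8 threshold, and all names (`vectorize`, `cosine`, `IncrementalClusters`) are assumptions made here for illustration.

```python
import math
from collections import Counter


def vectorize(source):
    """Hypothetical vectorization: counts of whitespace-separated tokens."""
    return Counter(source.split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class IncrementalClusters:
    """Assigns each new file to the most similar cluster, or opens a new one.

    Each cluster keeps the sum of its members' vectors; cosine similarity is
    scale-invariant, so the sum behaves like the mean for comparison purposes.
    """

    def __init__(self, threshold=0.8):  # threshold value is an assumption
        self.threshold = threshold
        self.centroids = []  # one summed Counter per cluster
        self.members = []    # file names per cluster

    def add(self, name, source):
        vec = vectorize(source)
        best, best_sim = -1, 0.0
        for i, centroid in enumerate(self.centroids):
            sim = cosine(vec, centroid)
            if sim > best_sim:
                best, best_sim = i, sim
        if best >= 0 and best_sim >= self.threshold:
            self.centroids[best].update(vec)  # Counter.update adds counts
            self.members[best].append(name)
            return best
        self.centroids.append(Counter(vec))
        self.members.append([name])
        return len(self.centroids) - 1
```

Because each submission is compared only against cluster summaries rather than against every earlier file, new files can be added one at a time without reprocessing the existing collection; files that land in the same cluster are then candidates for a closer snippet-level comparison.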

Bibliographic details

Source
IEEE International Conference on Software Engineering and Service Science | 2018 | p. 579 | 4 pages
Authors
Michal Ďuračík; Emil Kršák; Patrik Hrkút
Original format: PDF
Chinese Library Classification: Computer software
Keywords
Plagiarism; Databases; Clustering algorithms; Tools; Manuals; Reactive power; Syntactics
