Conference: IEEE Technology and Engineering Management Conference - Europe

Improved plagiarism detection with collaboration network visualization based on source-code similarity

Abstract

Plagiarism is a serious problem in higher education. Teachers use similarity (plagiarism) detection systems, which highlight similarities between student documents, to help them find it. Most systems are built for text, but there are specialized systems for finding similarities between source-code files. In most cases the results are presented as a table listing pairs of documents in descending order of similarity, and the teacher is then responsible for confirming which similar documents represent cases of plagiarism. Only a few systems present the results as a graph. Some studies indicate that using clustering algorithms to represent such data graphically can improve the speed and accuracy of finding potential instances of plagiarism in large collections of source-code files. This study addresses two research questions: Can visualizing student solutions (their source-code similarities) as a collaboration network help identify new cases of plagiarism? What are the steps to do so? The study was designed as two case studies, one performed on a graduate-level university course and one on a course in professional studies. The article presents empirical results from these two cases, in which a collaboration network representation (based on source-code similarity) was used. It argues that the graphical presentation can identify new clusters of plagiarised source-code files that would have been missed using the existing tabular presentation of the data.
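The abstract does not describe how the collaboration network is constructed, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a detection tool has already produced pairwise similarity scores; the file names, the scores, the 0.7 threshold, and the function names `build_network` and `clusters` are all hypothetical. Pairs at or above the threshold become edges, and connected components stand in for clusters.

```python
from collections import defaultdict


def build_network(pairs, threshold):
    """Build an undirected adjacency map from (file_a, file_b, score) rows,
    keeping only pairs whose score is at or above the threshold."""
    graph = defaultdict(set)
    for a, b, score in pairs:
        if score >= threshold:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def clusters(graph):
    """Return the connected components of the network as sorted lists."""
    seen = set()
    result = []
    for start in sorted(graph):
        if start in seen:
            continue
        # Depth-first walk collecting every file reachable from `start`.
        stack = [start]
        component = set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        result.append(sorted(component))
    return result


# Hypothetical similarity report, as a tabular tool might list it.
pairs = [
    ("alice.py", "bob.py", 0.91),
    ("dave.py", "erin.py", 0.85),
    ("bob.py", "carol.py", 0.78),
    ("alice.py", "carol.py", 0.42),
    ("frank.py", "alice.py", 0.10),
]

network = build_network(pairs, threshold=0.7)  # threshold is an assumption
groups = clusters(network)
for group in groups:
    print(group)
```

In this made-up data the pair alice.py and carol.py scores only 0.42 and would sit low in a sorted table, yet both files land in the same component because each is strongly similar to bob.py. That is the kind of group a network view can make visible where a pairwise list does not.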