Most previously released plagiarism detection tools and systems adopt normalized similarity measures, which are unreliable because they are sensitive to the size of the programs being compared. Most of these tools also have difficulty determining the cutoff threshold that discriminates plagiarized code from innocent code. In this paper, we present a new discrimination method based on the Weibull distribution, which has mainly been used to study the statistical significance of genomic sequence similarity. We applied our new detection method to a real programming competition, the ICPC East Asia Regional Contest. Our system successfully detected a few plagiarized programs in the preliminary round of the contest. This experience also revealed the characteristics of similarity among the source codes submitted in programming contests.
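To illustrate how a Weibull-based cutoff can replace a hand-picked threshold, the following is a minimal sketch and not the procedure used in the paper, whose details the abstract does not give. It assumes that raw (unnormalized) pairwise similarity scores are positive, that a two-parameter Weibull distribution is fitted to background scores by maximum likelihood, and that a pair is flagged when its upper-tail probability falls below a significance level. The names `fit_weibull`, `tail_probability`, `flag_suspicious`, and the parameter `alpha` are hypothetical.

```python
import math


def fit_weibull(scores):
    """Maximum-likelihood fit of a two-parameter Weibull; returns (shape, scale).

    Assumption: scores are positive and not all identical.
    """
    if min(scores) <= 0 or min(scores) == max(scores):
        raise ValueError("scores must be positive and not all identical")
    top = max(scores)
    scaled = [s / top for s in scores]  # rescale to (0, 1] so powers cannot overflow
    logs = [math.log(s) for s in scaled]
    mean_log = sum(logs) / len(logs)

    def g(k):
        # Likelihood equation for the shape k; it is increasing in k.
        powered = [s ** k for s in scaled]
        weighted = sum(p * l for p, l in zip(powered, logs)) / sum(powered)
        return weighted - 1.0 / k - mean_log

    lo, hi = 1e-3, 1.0
    while g(hi) < 0:  # expand the bracket until the root is enclosed
        hi *= 2.0
    for _ in range(100):  # bisection on the shape parameter
        mid = (lo + hi) / 2.0
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    shape = (lo + hi) / 2.0
    scale = top * (sum(s ** shape for s in scaled) / len(scaled)) ** (1.0 / shape)
    return shape, scale


def tail_probability(score, shape, scale):
    """P(X >= score) under the fitted Weibull distribution."""
    return math.exp(-((score / scale) ** shape))


def flag_suspicious(background_scores, candidate_scores, alpha=0.001):
    """Indices of candidate scores whose tail probability is below alpha."""
    shape, scale = fit_weibull(background_scores)
    return [
        i
        for i, score in enumerate(candidate_scores)
        if tail_probability(score, shape, scale) < alpha
    ]
```

With this formulation the cutoff is expressed as a probability (`alpha`) rather than as a raw similarity value, so it does not have to be re-tuned for each problem or program size. The choice of `alpha` itself remains an assumption of this sketch.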