International Conference on Advanced Computer Science and Information Systems

Dynamic Thresholding Mechanisms for IR-Based Filtering in Efficient Source Code Plagiarism Detection

Abstract

To address time inefficiency, string-matching-based source code plagiarism detection compares only potential pairs. Potentiality is defined through a fast yet order-insensitive similarity measurement (adapted from Information Retrieval), and only pairs whose similarity degrees are higher than or equal to a particular threshold are selected. Defining such a threshold is not trivial, since it should yield a high efficiency improvement with a low effectiveness reduction (if any reduction is unavoidable). This paper proposes two thresholding mechanisms, namely the range-based and pair-count-based mechanisms, that dynamically tune the threshold based on the distribution of the resulting similarity degrees. According to our evaluation, both mechanisms are more practical than manual threshold assignment since they are more proportional to efficiency improvement and effectiveness reduction.
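
The abstract names the two mechanisms but does not give their formulas, so the following is only a minimal sketch of one plausible reading, not the authors' method. It assumes cosine similarity over token frequencies as the order-insensitive measure, a range-based threshold placed at a fixed fraction of the span between the lowest and highest observed similarity, and a pair-count-based threshold equal to the similarity of the k-th highest-ranked pair. All function names and the `ratio` and `keep_count` parameters are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def cosine_similarity(tokens_a, tokens_b):
    """Order-insensitive similarity: cosine of token-frequency vectors (assumed measure)."""
    freq_a, freq_b = Counter(tokens_a), Counter(tokens_b)
    dot = sum(freq_a[t] * freq_b[t] for t in freq_a.keys() & freq_b.keys())
    norm_a = sqrt(sum(v * v for v in freq_a.values()))
    norm_b = sqrt(sum(v * v for v in freq_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def score_all_pairs(token_lists):
    """Return [((i, j), similarity)] for every unordered pair of submissions."""
    return [
        ((i, j), cosine_similarity(token_lists[i], token_lists[j]))
        for i, j in combinations(range(len(token_lists)), 2)
    ]


def range_based_threshold(similarities, ratio):
    """Assumed reading: threshold sits `ratio` of the way from min to max similarity."""
    low, high = min(similarities), max(similarities)
    return low + ratio * (high - low)


def pair_count_based_threshold(similarities, keep_count):
    """Assumed reading: threshold is the similarity of the keep_count-th best pair."""
    ranked = sorted(similarities, reverse=True)
    keep_count = max(1, min(keep_count, len(ranked)))  # clamp to a valid rank
    return ranked[keep_count - 1]


def filter_pairs(scored_pairs, threshold):
    """Keep only pairs whose similarity is higher than or equal to the threshold."""
    return [pair for pair, sim in scored_pairs if sim >= threshold]


if __name__ == "__main__":
    submissions = [
        ["int", "x", "=", "1", ";"],
        ["x", "int", "1", "=", ";"],   # same tokens, different order
        ["while", "(", "true", ")", ";"],
    ]
    scored = score_all_pairs(submissions)
    sims = [sim for _, sim in scored]
    threshold = range_based_threshold(sims, ratio=0.5)
    # Only the surviving pairs would be passed on to the slower string-matching step.
    print(filter_pairs(scored, threshold))
```

Either threshold adapts to the similarity distribution of the submission set at hand, which is the property the abstract contrasts with a manually assigned constant.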