A Fast Algorithm for Plagiarism Detection in Large-scale Data

Kensuke Baba

Abstract

This paper proposes a fast algorithm for plagiarism detection in large-scale data. Plagiarism of surface text, such as "copy and paste", can be detected using a simple document similarity based on string matching. The algorithm reduces the cost of computing the document similarity by approximating it. The effects of the approximation on processing time and accuracy are evaluated in experiments on a data set generated from real scholarly documents. The experimental results show that the algorithm based on the approximated similarity reduces the processing time of the straightforward algorithm based on the exact similarity to less than one-third, in exchange for a slight decrease in accuracy.
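The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea, not the author's implementation. It assumes that the document similarity is the largest number of word matches over all alignments of two documents, and that this count is computed as a cross-correlation via the discrete Fourier transform (both assumptions, suggested by the keywords "Approximate String Matching" and "Discrete Fourier Transform"). The exact version runs one correlation per distinct word; the approximate version replaces words with random +1/-1 codes so that a fixed number of correlations suffices. That random coding is an illustrative stand-in for the paper's vector representation of words. All function names (`fft`, `correlate`, `exact_scores`, `approx_scores`, `naive_scores`) are hypothetical.

```python
import cmath
import random


def fft(a, inverse=False):
    """Recursive radix-2 FFT (unnormalized); len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], inverse)
    odd = fft(a[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def correlate(x, y):
    """Return c with c[k] = sum_i x[i] * y[len(y) - 1 - k + i] for every alignment k."""
    length = len(x) + len(y) - 1
    size = 1
    while size < length:
        size *= 2
    fx = fft(list(x) + [0] * (size - len(x)))
    fy = fft(list(reversed(y)) + [0] * (size - len(y)))
    prod = fft([p * q for p, q in zip(fx, fy)], inverse=True)
    return [v.real / size for v in prod[:length]]


def exact_scores(x, y):
    """Exact match count per alignment: one correlation per shared distinct word."""
    total = [0.0] * (len(x) + len(y) - 1)
    for w in sorted(set(x) & set(y)):
        c = correlate([1 if t == w else 0 for t in x],
                      [1 if t == w else 0 for t in y])
        total = [a + b for a, b in zip(total, c)]
    return [round(v) for v in total]


def approx_scores(x, y, trials=8, seed=0):
    """Estimated match count per alignment from `trials` random +1/-1 word codes.

    For independent random signs, the expected product of two codes is 1 for
    equal words and 0 for different words, so each correlation is an unbiased
    estimate of the match count; averaging reduces the variance.
    """
    rng = random.Random(seed)
    vocab = sorted(set(x) | set(y))
    total = [0.0] * (len(x) + len(y) - 1)
    for _ in range(trials):
        code = {w: rng.choice((-1, 1)) for w in vocab}
        c = correlate([code[t] for t in x], [code[t] for t in y])
        total = [a + b for a, b in zip(total, c)]
    return [v / trials for v in total]


def naive_scores(x, y):
    """Reference implementation: compare words directly at every alignment."""
    n, m = len(x), len(y)
    return [
        sum(1 for i in range(n)
            if 0 <= m - 1 - k + i < m and x[i] == y[m - 1 - k + i])
        for k in range(n + m - 1)
    ]


doc_a = "the quick brown fox jumps over the lazy dog".split()
doc_b = "a quick brown fox jumps over a sleepy cat".split()
print(max(exact_scores(doc_a, doc_b)) / len(doc_a))   # exact similarity
print(max(approx_scores(doc_a, doc_b)) / len(doc_a))  # approximate similarity
```

In this sketch the exact version needs as many correlations as there are shared distinct words, while the approximate version needs only `trials` of them. This mirrors the trade-off the abstract reports between processing time and accuracy, though the paper's actual approximation may differ.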

Bibliographic record

Source: Journal of digital information management, 2017, Issue 6, pp. 331-338 (8 pages)
Author: Kensuke Baba
Affiliation: Fujitsu Laboratories Japan
Original format: PDF
Language: English
Keywords: Plagiarism Detection; Approximate String Matching; Vector Representation of Words; Discrete Fourier Transform
Date added to database: 2022-08-17 13:49:14
