Conference: IEEE International Conference on Computer Supported Cooperative Work in Design

A Crowd Science framework to support the construction of a Gold Standard Corpus for Plagiarism Detection



Abstract

The construction of a Gold Standard Corpus for Plagiarism Detection (GSCPD) is a challenging task for reproducible research in computer science, given the trade-off between the time spent by experts and the size, quality, and reliability of the resulting GSCPD. To address this challenge, this paper describes a framework to support the construction of a GSCPD in any language. Aiming for reproducibility and scalability, the framework involves a data acquisition process and a Crowd Science project that employs human processing power to identify plagiarism in pairs of texts extracted via the data acquisition process. This paper also presents the application of the framework to the Portuguese language and the preliminary results of a feasibility study on the use of a tool that is part of the framework.
