PART'99 (conference proceedings)

Parallel Overlap and Similarity Detection in Semi-Structured Document Collections


Abstract

The proliferation of digital libraries and the high availability of electronic documents on the Internet have created new challenges for computer science researchers and professionals. This paper discusses the problems of using parallel and cluster computing systems to detect plagiarism in large collections of semi-structured electronic texts, ranging from software written in formal languages at one end of the spectrum to natural language texts at the other. The main component of the system uses string matching algorithms and suffix trees. Implementation and performance issues are also discussed.
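
As a rough illustration of the suffix-based string matching the abstract mentions, the sketch below finds the longest passage shared by two documents. It is not the authors' implementation: a naively sorted list of suffixes (a simple suffix array) stands in for the suffix tree, the function name `longest_common_substring` is hypothetical, and the code assumes the separator character `\x00` does not occur in either document. It runs on a single machine and does not attempt the parallel or cluster distribution the paper discusses.

```python
def longest_common_substring(a: str, b: str) -> str:
    """Return one longest substring shared by a and b.

    Illustrative only: uses a naively built suffix array, which is far
    slower than a real suffix tree on large documents.
    """
    sep = "\x00"  # assumption: this character appears in neither document
    text = a + sep + b
    boundary = len(a)  # positions < boundary belong to document a

    # Naive suffix array: sort suffix start positions by the suffix text.
    suffixes = sorted(range(len(text)), key=lambda i: text[i:])

    best_start, best_len = 0, 0
    for prev, cur in zip(suffixes, suffixes[1:]):
        # Only neighbours that start in different documents can
        # witness an overlap between the two documents.
        if (prev < boundary) == (cur < boundary):
            continue
        # Length of the common prefix of the two suffixes,
        # never extending across the separator.
        n = 0
        while (prev + n < len(text) and cur + n < len(text)
               and text[prev + n] == text[cur + n]
               and text[prev + n] != sep):
            n += 1
        if n > best_len:
            best_start, best_len = prev, n
    return text[best_start:best_start + best_len]


if __name__ == "__main__":
    doc_a = "the quick brown fox"
    doc_b = "a quick brown dog"
    print(repr(longest_common_substring(doc_a, doc_b)))
```

Sorting suffixes places suffixes with long common prefixes next to each other, so the longest overlap between the two documents always shows up between some adjacent pair that originates in different documents. A suffix tree exposes the same information while being constructible in linear time, which matters for the large collections the abstract has in mind.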
