Conference: International Workshop on Database and Expert Systems Applications

On automatic similarity linking in digital libraries

Abstract

Hypertext links are a powerful extension of standard information retrieval techniques based on query languages. However, generating links is often impractical because of the large manual and/or computational effort involved. We analyze the effects of two main approaches that aim to reduce this effort: using OCR-processed documents directly instead of manually post-processed (i.e., corrected) documents, and using shorter excerpts of documents instead of complete documents. For our tests, similarity links were computed with the vector-space model; the links generated from unmodified OCR documents and from document excerpts are then compared with the links generated from complete documents without OCR errors.
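The comparison described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' implementation: the abstract names only the vector-space model, so the TF-IDF weighting, the cosine threshold of 0.3, whitespace tokenization, leading-token excerpts (`max_tokens`), and the precision/recall agreement measure are hypothetical choices, as are all function names.

```python
import math
from collections import Counter
from itertools import combinations


def tokenize(text, max_tokens=None):
    """Lowercase whitespace tokenization; max_tokens keeps only a leading excerpt."""
    tokens = text.lower().split()
    return tokens if max_tokens is None else tokens[:max_tokens]


def tfidf_vectors(token_lists):
    """Return one sparse TF-IDF vector (dict: term -> weight) per document."""
    n_docs = len(token_lists)
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter(term for tokens in token_lists for term in set(tokens))
    vectors = []
    for tokens in token_lists:
        term_freq = Counter(tokens)
        vectors.append({
            # Assumed weighting: raw term count times (log(N / df) + 1).
            term: count * (math.log(n_docs / doc_freq[term]) + 1.0)
            for term, count in term_freq.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity of two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * b[term] for term, weight in a.items() if term in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def similarity_links(texts, threshold=0.3, max_tokens=None):
    """Link every document pair (i, j), i < j, whose cosine reaches the threshold."""
    vectors = tfidf_vectors([tokenize(t, max_tokens) for t in texts])
    return {
        (i, j)
        for i, j in combinations(range(len(vectors)), 2)
        if cosine(vectors[i], vectors[j]) >= threshold
    }


def link_agreement(reference, candidate):
    """Precision and recall of candidate links against reference links.

    An empty denominator is treated as 1.0 (an assumed convention).
    """
    shared = len(reference & candidate)
    precision = shared / len(candidate) if candidate else 1.0
    recall = shared / len(reference) if reference else 1.0
    return precision, recall
```

Under these assumptions, one would call `similarity_links` once on the corrected full texts to obtain reference links, once on the raw OCR texts, and once with `max_tokens` set to obtain excerpt-based links, then pass each result to `link_agreement` together with the reference set.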
