
Knowledge acquisition and representation for document structure recognition: The CAROL Project

Abstract

The authors describe a rule-based recognition system that reconstructs the structure of paper documents. The method is applied to an automatic cataloging system intended for use in libraries. Documents are scanned and passed through a character recognition engine. The recognition output, augmented with layout information, serves as input to the rule interpreter. The rules for a specific document type are generated by a learning module, which lets the user create a rule set for a new document type. The learning component applies several generalization rules that are also found in machine learning systems. CAROL, a demonstration prototype, is currently being tested by librarians.
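The abstract does not give CAROL's rule format, so the following is only a minimal sketch of the pipeline it outlines: text blocks with layout attributes go into a rule interpreter, and rules are learned from user-labelled examples. Everything here is an assumption for illustration: the `Block` and `Rule` types, the two layout attributes (`font_size`, `y`), and the choice of interval closing as the generalization rule, which is a standard technique from machine learning and not necessarily one CAROL uses.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Block:
    """One recognized text block plus layout attributes (hypothetical format)."""
    text: str
    font_size: float
    y: float  # vertical position: 0.0 = top of page, 1.0 = bottom


@dataclass(frozen=True)
class Rule:
    """Labels a block whose attributes fall inside closed intervals."""
    label: str
    font_size: tuple[float, float]
    y: tuple[float, float]

    def matches(self, block: Block) -> bool:
        return (self.font_size[0] <= block.font_size <= self.font_size[1]
                and self.y[0] <= block.y <= self.y[1])


def learn_rule(label: str, examples: list[Block]) -> Rule:
    """Generalize labelled examples by closing the interval over each attribute."""
    if not examples:
        raise ValueError("at least one example is required")
    sizes = [b.font_size for b in examples]
    ys = [b.y for b in examples]
    return Rule(label, (min(sizes), max(sizes)), (min(ys), max(ys)))


def interpret(rules: list[Rule], blocks: list[Block]) -> list[tuple[Optional[str], str]]:
    """Assign each block the label of the first matching rule, or None."""
    result = []
    for block in blocks:
        label = next((r.label for r in rules if r.matches(block)), None)
        result.append((label, block.text))
    return result


# Illustrative data only: two labelled examples per element of a title page.
rules = [
    learn_rule("title", [Block("Document Analysis", 18.0, 0.10),
                         Block("Machine Learning", 22.0, 0.15)]),
    learn_rule("author", [Block("A. Author", 12.0, 0.25),
                          Block("B. Writer", 11.0, 0.30)]),
]
page = [
    Block("Pattern Recognition", 20.0, 0.12),
    Block("C. Someone", 11.5, 0.28),
    Block("Publisher line", 9.0, 0.90),
]
labelled = interpret(rules, page)
```

In this sketch the third block matches neither learned interval and stays unlabelled. That is the kind of case where a user would supply further examples so the learning step can extend the rule set.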
