ODIL: an SGML description language of the layout structure of documents

Abstract

This paper describes an SGML coding format for the output of a document recognition prototype. Our proposal is a DTD named ODIL (Office Document Image description Language) that precisely describes the layout structure of a document after all recognition phases, including OCR. All layout objects of a document are defined as SGML elements, and their characteristics are defined by SGML attributes. The basic objects are blocks, each containing homogeneous information. ODIL supports five types of information: text, photos, line graphics, tables, and mathematical formulas. The ODIL representation of the recognition results is well suited to subsequent logical structure recognition. Starting from the ODIL DTD and using the RAINBOW transit DTD makes it possible to apply SGML tools to logical structure recognition, which is viewed as an SGML up-conversion problem.
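The idea of layout objects as elements and their characteristics as attributes can be illustrated with a small sketch. The abstract does not give the actual ODIL element or attribute names, so everything below is hypothetical: the `page` and `block` element names, the `type`, `x`, `y`, `width`, and `height` attributes, and the `Block` class itself are assumptions made for illustration only. Only the five block types come from the abstract.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List
from xml.sax.saxutils import escape, quoteattr


class BlockType(Enum):
    # The five information types named in the abstract.
    # The string values are hypothetical, not taken from the ODIL DTD.
    TEXT = "text"
    PHOTO = "photo"
    LINE_GRAPHIC = "graphic"
    TABLE = "table"
    FORMULA = "formula"


@dataclass
class Block:
    # Hypothetical characteristics of a layout block:
    # its information type and a bounding box in pixels.
    kind: BlockType
    x: int
    y: int
    width: int
    height: int
    content: str = ""  # recognized text (e.g. OCR output), if any

    def to_sgml(self) -> str:
        # Characteristics become attributes; recognized text becomes content.
        attrs = {
            "type": self.kind.value,
            "x": self.x,
            "y": self.y,
            "width": self.width,
            "height": self.height,
        }
        attr_text = " ".join(
            f"{name}={quoteattr(str(value))}" for name, value in attrs.items()
        )
        return f"<block {attr_text}>{escape(self.content)}</block>"


def page_to_sgml(blocks: List[Block]) -> str:
    # Wrap all blocks of one page in a hypothetical <page> element.
    body = "\n".join("  " + block.to_sgml() for block in blocks)
    return f"<page>\n{body}\n</page>"


if __name__ == "__main__":
    blocks = [
        Block(BlockType.TEXT, 10, 20, 300, 40, "Title & subtitle"),
        Block(BlockType.TABLE, 10, 80, 300, 120),
    ]
    print(page_to_sgml(blocks))
```

Keeping each block homogeneous (one information type per block) means a later logical-structure step can dispatch on a single attribute value rather than inspecting the block's content.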