Journal of Computing in Civil Engineering

Automatic Extraction of Apparent Semantic Structure from Text Contents of a Structural Calculation Document



Abstract

This paper proposes a generic method for automatically extracting the apparent semantic document structure from a structural calculation document. The method consists of two processes: extracting subtitles and classifying the depth levels of those subtitles. The subtitles become the tree nodes of the apparent semantic structure. A context model of technical documents was built to extract subtitles from plain text information. In addition, a formal classification method for determining the depth levels of the subtitles was developed and used to build a document tree with sequentially ordered subtitles. An application module of the proposed method, which transforms a plain text document into a semi-structured XML document, was implemented. The performance of the application module was evaluated with 40 test documents, including structural calculation documents, technical reports, and theses.
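The two processes named in the abstract (subtitle extraction, then depth classification into a tree emitted as XML) can be illustrated with a small Python sketch. This is not the paper's method: the abstract does not describe its context model or its formal classification rules, so the sketch substitutes a much simpler assumption, namely that a subtitle is a short line starting with decimal numbering such as "2" or "2.1", and that the depth equals the number of numbering components. The names `HEADING`, `extract_subtitles`, and `to_xml`, and the `document`/`section` element names, are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Assumption: a subtitle starts with decimal numbering ("2", "2.1", "2.1.3").
# This stands in for the paper's context model, which is not given here.
HEADING = re.compile(r"^\s*(\d+(?:\.\d+)*)\.?\s+(.+?)\s*$")


def extract_subtitles(text):
    """Return (depth, title) pairs in document order."""
    subtitles = []
    for line in text.splitlines():
        match = HEADING.match(line)
        if not match:
            continue
        numbering, title = match.groups()
        # Crude filter: skip long lines and sentences that end with a period.
        if len(title) > 80 or title.endswith("."):
            continue
        depth = numbering.count(".") + 1  # "2.1" has depth 2
        subtitles.append((depth, title))
    return subtitles


def to_xml(subtitles):
    """Build a nested XML tree from sequentially ordered (depth, title) pairs."""
    root = ET.Element("document")
    stack = [(0, root)]  # path from the root to the most recent node
    for depth, title in subtitles:
        # Pop until the top of the stack is shallower than the new subtitle.
        while stack[-1][0] >= depth:
            stack.pop()
        node = ET.SubElement(stack[-1][1], "section",
                             {"title": title, "depth": str(depth)})
        stack.append((depth, node))
    return root


sample = """1 Design Conditions
The building is a five-story office.
1.1 Materials
1.2 Loads
2 Member Design
"""
tree = to_xml(extract_subtitles(sample))
print(ET.tostring(tree, encoding="unicode"))
```

A numbering-based rule like this misfires on body lines that begin with a number (for example a load value followed by units), which is the kind of ambiguity a context model would need to resolve.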
