Conference: International Conference on Frontiers in Handwriting Recognition

Hand-Written and Machine-Printed Text Classification in Architecture, Engineering Construction Documents

Abstract

In the AEC (Architecture, Engineering & Construction) industry, drawing documents serve as blueprints that facilitate the construction process. They also act as a graphical language that communicates ideas and information from one mind to another. Text in AEC documents appears in both machine-printed and hand-written form. Because the recognition algorithms for machine-printed and hand-written text differ, the two types must be distinguished before a document is sent to the respective recognition system. In this paper we propose a novel approach for classifying machine-printed and hand-written text in AEC documents. Before classification, our system applies preprocessing that includes binarization, text-graphics separation and word segmentation. Words are segmented based on certain structural properties of the Isothetic Covers (IC) that tightly enclose the words in a document. The grid size of the IC is selected through statistical analysis of the document's connected components. Word-level Gabor filter based features are then extracted with spooling information for classification. A standard SVM-based classifier is used to classify the text. The task is performed at the word level on AEC documents, and we achieved an overall accuracy of 98.45%.
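The word-level Gabor feature step mentioned in the abstract can be illustrated with a minimal sketch. The abstract gives no filter parameters or feature definitions, so everything below is an assumption for illustration only: the function names (`gabor_kernel`, `filter_same`, `gabor_features`), the kernel size, wavelength, sigma, the four orientations, and the choice of mean and standard deviation of the absolute response as features. It is not the authors' implementation.

```python
import numpy as np

def gabor_kernel(size, wavelength, theta, sigma, gamma=0.5):
    """Real (cosine) part of a 2-D Gabor kernel; `size` should be odd.
    All parameter values used in this sketch are illustrative assumptions."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    # Rotate coordinates so the carrier wave oscillates along direction theta.
    x_r = x * np.cos(theta) + y * np.sin(theta)
    y_r = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(x_r ** 2 + (gamma * y_r) ** 2) / (2.0 * sigma ** 2))
    carrier = np.cos(2.0 * np.pi * x_r / wavelength)
    kernel = envelope * carrier
    # Subtract the mean so uniform regions produce (almost) no response.
    return kernel - kernel.mean()

def filter_same(image, kernel):
    """Zero-padded 2-D convolution via FFT, cropped to the image size."""
    h, w = image.shape
    kh, kw = kernel.shape
    shape = (h + kh - 1, w + kw - 1)
    spectrum = np.fft.rfft2(image, s=shape) * np.fft.rfft2(kernel, s=shape)
    full = np.fft.irfft2(spectrum, s=shape)
    top, left = kh // 2, kw // 2
    return full[top:top + h, left:left + w]

def gabor_features(word_image, n_orientations=4, wavelength=4.0,
                   sigma=2.0, size=9):
    """Return [mean, std] of the absolute filter response per orientation."""
    feats = []
    for k in range(n_orientations):
        theta = k * np.pi / n_orientations
        kernel = gabor_kernel(size, wavelength, theta, sigma)
        response = np.abs(filter_same(word_image, kernel))
        feats.extend([response.mean(), response.std()])
    return np.asarray(feats)

# Synthetic stand-in for a binarized word image: vertical stripes, period 4 px.
columns = np.arange(32)
stripes = np.tile((columns % 4 < 2).astype(float), (32, 1))
features = gabor_features(stripes)
print(features.shape)
```

Feature vectors of this kind, computed per segmented word, could then be passed to an SVM classifier (for example scikit-learn's `SVC`) trained on labelled machine-printed and hand-written words. The intuition is that machine-printed strokes tend to have more regular orientation and spacing than hand-written ones, which shows up in the per-orientation response statistics.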
