Published in: Expert Systems with Applications

An effective and conceptually simple feature representation for off-line text-independent writer identification


Abstract

Feature engineering is an important component of machine learning and pattern recognition. It is fundamental to off-line writer identification from handwritten documents, which remains an active subject of research in forensic and authentication applications. In this work, we propose an efficient yet computationally and conceptually simple framework for off-line text-independent writer identification that uses local textural features to characterize the writing style of each writer. These features are Local Binary Patterns (LBP), Local Ternary Patterns (LTP), and Local Phase Quantization (LPQ). Our approach examines the writing images at small observation regions: a set of connected-component sub-images is cropped and extracted from each handwriting sample (a document or a set of word or text-line images). Each connected component is treated as a texture image and subjected to feature extraction using LBP, LTP, or LPQ. Then, after dimensionality reduction followed by subdivision of the feature image into a number of non-overlapping regions, a histogram sequence concatenation is applied. For classification, a 1-NN (nearest neighbor) classifier identifies the writer of each questioned sample based on the dissimilarity of feature vectors computed from all components in the writing. Experiments on the IFN/ENIT (411 writers, Arabic), AHTID/MW (53 writers, Arabic), CVL (309 writers, English), and IAM (657 writers, English) databases demonstrate that the proposed system outperforms earlier and recent state-of-the-art writer identification systems on Arabic script and performs competitively on English script. (C) 2019 Elsevier Ltd. All rights reserved.
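The pipeline described in the abstract can be illustrated with a minimal sketch. It covers only the LBP variant and is not the authors' implementation. It assumes that connected components have already been cropped into small grayscale arrays, skips the dimensionality-reduction step, and compares one component at a time, whereas the abstract bases the decision on all components in the writing. The function names (`lbp_codes`, `block_histograms`, `describe`, `nearest_writer`), the 2x2 grid, and the chi-square distance are illustrative assumptions; the abstract only speaks of "dissimilarity".

```python
import numpy as np


def lbp_codes(img):
    """Basic 8-neighbour LBP code for every interior pixel of a 2-D array."""
    img = np.asarray(img, dtype=np.float64)
    h, w = img.shape
    center = img[1:-1, 1:-1]
    # Neighbour offsets, clockwise from the top-left; each sets one bit.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros(center.shape, dtype=np.int64)
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        codes |= (neighbour >= center).astype(np.int64) << bit
    return codes


def block_histograms(codes, grid=(2, 2), bins=256):
    """Concatenate normalised histograms of non-overlapping blocks."""
    feats = []
    for rows in np.array_split(codes, grid[0], axis=0):
        for block in np.array_split(rows, grid[1], axis=1):
            hist = np.bincount(block.ravel(), minlength=bins).astype(np.float64)
            feats.append(hist / max(hist.sum(), 1.0))
    return np.concatenate(feats)


def describe(img, grid=(2, 2)):
    """Feature vector of one connected-component image (assumed pre-cropped)."""
    return block_histograms(lbp_codes(img), grid)


def nearest_writer(query, gallery):
    """1-NN over (writer_id, feature_vector) pairs.

    Chi-square distance is an assumed choice of dissimilarity measure.
    """
    best_id, best_dist = None, np.inf
    for writer_id, feat in gallery:
        dist = np.sum((query - feat) ** 2 / (query + feat + 1e-12))
        if dist < best_dist:
            best_id, best_dist = writer_id, dist
    return best_id
```

In use, a gallery would hold one `describe(...)` vector per reference component together with its writer label, and `nearest_writer(describe(component), gallery)` would return the label of the closest reference. Extending this to a whole document would need some rule for combining the per-component distances, which the abstract does not specify.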
