Journal: WSEAS Transactions on Computers

Principal Components and Invariant Moments Analysis-Font Recognition Applied


Abstract

An alternative font recognition (FR) methodology is proposed in this work. It is based on global texture analysis of document text using invariant moments (invariant to scale, rotation and translation) and principal component analysis (PCA). No separate analysis of individual letters or words is required. In our method, central moment features are extracted from a window (of estimated size) as a global characteristic of each font, and PCA is applied over the same window. A uniform text block set in a single font is sufficient to provide the font-specific properties needed for recognition. The proposed scheme was tested with the following fonts: Courier, Arial, Bookman Old Style, Franklin Gothic Medium, Comic Sans, Impact, Modern and Times New Roman, each in four styles: regular, italic, bold and bold italic. The invariant moment technique is used to extract the font characteristics using window size estimation; a database was built from a set of input texts for the learning stage, and standard statistical classifiers were then applied in the identification stage. We found that invariant moments combined with principal component analysis give excellent results in the font identification task. An exhaustive study was performed with 8 fonts commonly used in the Spanish (Mexican) language. Each font can have four styles, giving 32 font combinations in total. Our results also include tests over different font sizes (6, 8, 10 and 12 points) and different orientations (0, 45, 90 and 135 degrees). The robustness of the algorithm to Gaussian noise is also examined.
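The pipeline outlined in the abstract (window-level invariant moments, then PCA, then a statistical classifier) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The abstract does not give the window size, the number of principal components, the exact moment set or the classifier, so the fixed 32-pixel window, the seven Hu moments, the `k` parameter and the nearest-class-mean rule below are all assumptions chosen for illustration, as are the function names. The sketch also assumes ink pixels carry high intensity (invert a dark-on-white scan first).

```python
import numpy as np

def hu_moments(window):
    """Seven Hu moments (translation, scale, rotation invariant) of a 2-D window."""
    img = np.asarray(window, dtype=float)
    ys, xs = np.mgrid[:img.shape[0], :img.shape[1]]
    m00 = img.sum()
    xc, yc = (xs * img).sum() / m00, (ys * img).sum() / m00  # intensity centroid

    def eta(p, q):
        # Central moment mu_pq, normalised by m00^(1 + (p+q)/2) for scale invariance.
        mu = ((xs - xc) ** p * (ys - yc) ** q * img).sum()
        return mu / m00 ** (1 + (p + q) / 2)

    n20, n02, n11 = eta(2, 0), eta(0, 2), eta(1, 1)
    n30, n03, n21, n12 = eta(3, 0), eta(0, 3), eta(2, 1), eta(1, 2)
    a, b = n30 + n12, n21 + n03
    c, d = n30 - 3 * n12, 3 * n21 - n03
    return np.array([
        n20 + n02,
        (n20 - n02) ** 2 + 4 * n11 ** 2,
        c ** 2 + d ** 2,
        a ** 2 + b ** 2,
        c * a * (a ** 2 - 3 * b ** 2) + d * b * (3 * a ** 2 - b ** 2),
        (n20 - n02) * (a ** 2 - b ** 2) + 4 * n11 * a * b,
        d * a * (a ** 2 - 3 * b ** 2) - c * b * (3 * a ** 2 - b ** 2),
    ])

def window_features(image, win=32):
    """Hu moments of each non-overlapping win x win window (win is an assumed size)."""
    image = np.asarray(image, dtype=float)
    rows, cols = image.shape
    feats = []
    for r in range(0, rows - win + 1, win):
        for c in range(0, cols - win + 1, win):
            patch = image[r:r + win, c:c + win]
            if patch.sum() > 0:  # skip blank windows, whose moments are undefined
                feats.append(hu_moments(patch))
    return np.array(feats).reshape(-1, 7)

def pca_fit(X, k):
    """Return the feature mean and the top-k principal directions (rows) via SVD."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:k]

def pca_project(X, mean, components):
    """Project feature vectors onto the fitted principal directions."""
    return (X - mean) @ components.T

def fit_class_means(Z, labels):
    """Learning stage of the assumed classifier: one mean vector per font label."""
    labels = np.asarray(labels)
    classes = np.unique(labels)
    return classes, np.array([Z[labels == c].mean(axis=0) for c in classes])

def predict(Z, classes, means):
    """Identification stage: assign each vector to the nearest class mean."""
    dist = np.linalg.norm(Z[:, None, :] - means[None, :, :], axis=2)
    return classes[dist.argmin(axis=1)]
```

In use, `window_features` would be applied to a text block of each known font to build the training set, `pca_fit` and `fit_class_means` would form the learning stage, and `pca_project` followed by `predict` would label the windows of an unknown block. Because the seven moments differ by orders of magnitude, a real system would probably rescale them before PCA; that step is left out here to keep the sketch short.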