Journal: Multimedia Tools and Applications

Optimal feature and classifier selection for text region classification in natural scene images using Weka tool

Abstract

Text detection and localization in scene images has long challenged researchers because of the diversity of these images: variation in font, size, and color, different backgrounds, and so on. The textual content of such images is useful in many applications, such as assistance for visually impaired people, scene understanding, and intelligent navigation. A natural scene contains non-text objects alongside the relevant text objects, and they must be classified accurately to improve the performance of a detection and localization method. The classification of text regions in scene images depends on the choice of features and of classifier. This work contributes to finding both the optimal feature set and the optimal classifier with the help of the Weka tool. First, we detect candidate text regions with an improved MSER algorithm; then we extract 11 features from these candidate regions. From these 11 features, we choose an optimal feature set for discriminating between text and non-text components using the CfsSubsetEval and BFS options of Weka. Using these optimal features, we trained several classifiers in Weka on the ICDAR 2013 training set and compared them empirically by the classification accuracy obtained in Weka. Based on this comparison, the Naive Bayes classifier, which achieved the highest accuracy (92.5%), is proposed as the optimal choice for classification.
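
To make the feature-selection step more concrete, the sketch below illustrates the idea behind correlation-based feature subset selection: a subset scores well when its features correlate with the class but not with each other. It is not the paper's code and not Weka's implementation. It assumes that "BFS" refers to a best-first style search, it swaps in Pearson correlation for the correlation measure, and it uses a plain greedy forward search without backtracking. The toy data and the names `f0`, `f1`, `f2` are hypothetical and do not correspond to the paper's 11 features.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def cfs_merit(columns, labels, subset):
    """CFS-style merit: k * r_cf / sqrt(k + k*(k-1) * r_ff).

    r_cf is the mean |feature-class correlation| over the subset,
    r_ff is the mean |feature-feature correlation| over all pairs in it.
    """
    k = len(subset)
    if k == 0:
        return 0.0
    r_cf = sum(abs(pearson(columns[i], labels)) for i in subset) / k
    pairs = list(combinations(subset, 2))
    r_ff = (
        sum(abs(pearson(columns[i], columns[j])) for i, j in pairs) / len(pairs)
        if pairs
        else 0.0
    )
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)


def forward_select(columns, labels, tol=1e-12):
    """Greedy forward search: add the feature that raises the merit most, stop when none does."""
    selected, best = [], 0.0
    remaining = set(range(len(columns)))
    while remaining:
        cand, cand_merit = None, best
        for i in sorted(remaining):
            m = cfs_merit(columns, labels, selected + [i])
            if m > cand_merit + tol:
                cand, cand_merit = i, m
        if cand is None:
            break
        selected.append(cand)
        remaining.remove(cand)
        best = cand_merit
    return selected, best


# Hypothetical toy data: 8 candidate regions, label 1 = text, 0 = non-text.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
f0 = [0.1, 0.2, 0.15, 0.3, 0.9, 0.8, 0.85, 0.7]  # informative feature
f1 = [2 * v for v in f0]                          # redundant copy of f0
f2 = [1, 0, 1, 0, 1, 0, 1, 0]                     # unrelated to the label
columns = [f0, f1, f2]

selected, merit = forward_select(columns, labels)
print("selected feature indices:", selected, "merit:", round(merit, 4))
```

On this toy data the redundant feature adds nothing to the merit and the unrelated feature lowers it, so the search keeps a single informative feature. In Weka itself, the corresponding step is configured through the attribute-selection panel rather than written by hand.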