Journal: IEEE/ASME Transactions on Mechatronics

Portable Camera-Based Assistive Text and Product Label Reading From Hand-Held Objects for Blind Persons


Abstract

We propose a camera-based assistive text reading framework to help blind persons read text labels and product packaging from hand-held objects in their daily lives. To isolate the object from cluttered backgrounds or other surrounding objects in the camera view, we first propose an efficient and effective motion-based method that defines a region of interest (ROI) in the video by asking the user to shake the object. This method extracts the moving object region using mixture-of-Gaussians-based background subtraction. Within the extracted ROI, text localization and recognition are conducted to acquire text information. To automatically localize the text regions in the object ROI, we propose a novel text localization algorithm that learns gradient features of stroke orientations and distributions of edge pixels in an AdaBoost model. Text characters in the localized text regions are then binarized and recognized by off-the-shelf optical character recognition (OCR) software. The recognized text codes are output to blind users as speech. The performance of the proposed text localization algorithm is quantitatively evaluated on the ICDAR-2003 and ICDAR-2011 Robust Reading Datasets. Experimental results demonstrate that our algorithm achieves state-of-the-art performance. The proof-of-concept prototype is also evaluated on a dataset collected with ten blind persons to assess the effectiveness of the system's hardware. We explore user interface issues and assess the robustness of the algorithm in extracting and reading text from different objects with complex backgrounds.
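
The motion-based ROI step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it is a simplified, pure-Python per-pixel mixture-of-Gaussians background model in the style of Stauffer and Grimson, run on synthetic grayscale frames. The function names (update_pixel, motion_roi, synthetic_frames) and all parameter values (k, alpha, init_var, bg_ratio, the variance floor) are hypothetical choices made for illustration, and the learning rate alpha is reused for the mean and variance updates as a simplifying assumption.

```python
import math
import random


def update_pixel(modes, x, k=3, alpha=0.05, init_var=36.0, bg_ratio=0.7):
    """Update one pixel's Gaussian mixture with intensity x.

    Each mode is [weight, mean, variance]. Returns True if x is foreground.
    """
    # Find the first mode within 2.5 standard deviations of x.
    matched = None
    for m in modes:
        if abs(x - m[1]) <= 2.5 * math.sqrt(m[2]):
            matched = m
            break
    # Decay all weights, then reinforce the matched mode.
    for m in modes:
        m[0] *= 1.0 - alpha
    if matched is not None:
        matched[0] += alpha
        d = x - matched[1]
        matched[1] += alpha * d
        matched[2] = max(4.0, matched[2] + alpha * (d * d - matched[2]))
    else:
        # No match: replace the weakest mode with a new one centred on x.
        if len(modes) >= k:
            modes.remove(min(modes, key=lambda m: m[0]))
        matched = [alpha, float(x), init_var]
        modes.append(matched)
    total = sum(m[0] for m in modes)
    for m in modes:
        m[0] /= total
    # Background modes: highest weight/sigma first, until bg_ratio is covered.
    modes.sort(key=lambda m: m[0] / math.sqrt(m[2]), reverse=True)
    cumulative = 0.0
    for m in modes:
        if m is matched:
            return False  # x is explained by a background mode
        cumulative += m[0]
        if cumulative > bg_ratio:
            break
    return True


def motion_roi(frames, warmup=20):
    """Return (x_min, y_min, x_max, y_max) of pixels that moved, or None."""
    h, w = len(frames[0]), len(frames[0][0])
    model = [[[] for _ in range(w)] for _ in range(h)]
    moved = []
    for t, frame in enumerate(frames):
        for y in range(h):
            for x in range(w):
                is_fg = update_pixel(model[y][x], frame[y][x])
                if is_fg and t >= warmup:  # ignore the model's warm-up frames
                    moved.append((x, y))
    if not moved:
        return None
    xs = [p[0] for p in moved]
    ys = [p[1] for p in moved]
    return (min(xs), min(ys), max(xs), max(ys))


def synthetic_frames(height=12, width=16, warmup=20, moving=6, seed=0):
    """Noisy static background, then a bright 3x3 block shaken left and right."""
    rng = random.Random(seed)
    frames = []
    for t in range(warmup + moving):
        frame = [[50 + rng.randint(-2, 2) for _ in range(width)]
                 for _ in range(height)]
        if t >= warmup:
            x0 = 5 + (t % 2)  # block alternates between two positions
            for y in range(4, 7):
                for x in range(x0, x0 + 3):
                    frame[y][x] = 200
        frames.append(frame)
    return frames


frames = synthetic_frames()
roi = motion_roi(frames)
print("ROI (x_min, y_min, x_max, y_max):", roi)
```

In this toy setup the bounding box covers both positions of the shaken block, which mirrors the idea of asking the user to shake the object so that it separates from a static, cluttered background. A practical system would more likely rely on an optimized library implementation of background subtraction and would clean the foreground mask before computing the box.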
