Techniques for language identification for hybrid Arabic-English document images


Abstract

Because the Arabic language differs in its characteristics from Romance and Anglo-Saxon languages, recognizing documents written in a mix of these languages requires that the language of the text be identified before the recognition phase. In this paper, three efficient techniques for discriminating between text written in Arabic script and text written in English script are presented and evaluated. These techniques address the language identification problem at the word level and at the text level. The characteristics of horizontal projection profiles and run-length histograms for text written in both languages are the basic features underlying these techniques. Solving this problem is important for building bilingual document image analysis systems capable of processing documents that contain a mix of Arabic and Romance or Anglo-Saxon languages.
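The abstract names two features, horizontal projection profiles and run-length histograms, but does not give the decision rules built on them. The following is a minimal sketch of how those two features could be computed for a binarized word or line image. It is not the authors' implementation: the function names, the toy image, and the convention that 1 means ink and 0 means background are all assumptions made for illustration.

```python
from collections import Counter
from typing import List

# Assumed convention: a binary image is a list of rows, 1 = ink, 0 = background.
Image = List[List[int]]


def horizontal_projection(img: Image) -> List[int]:
    """Return the number of ink pixels in each row (top to bottom)."""
    return [sum(row) for row in img]


def horizontal_run_lengths(img: Image) -> Counter:
    """Return a histogram mapping run length to the number of
    horizontal runs of consecutive ink pixels with that length."""
    hist: Counter = Counter()
    for row in img:
        run = 0
        for px in row:
            if px:
                run += 1
            elif run:
                hist[run] += 1  # a run just ended on a background pixel
                run = 0
        if run:
            hist[run] += 1  # a run that reaches the right edge of the row
    return hist


# Hypothetical toy image, not data from the paper.
toy = [
    [0, 0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0, 1],
    [1, 1, 1, 1, 1, 1],
    [0, 1, 0, 0, 0, 0],
]

profile = horizontal_projection(toy)
runs = horizontal_run_lengths(toy)
print(profile)
print(dict(runs))
```

A classifier along the lines the abstract describes would then compare the shape of `profile` and the distribution in `runs` between scripts; the thresholds or rules for that comparison are not stated here and would have to come from the paper itself.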


Bibliographic details

Source: 2001, pp. 1100-1104 (5 pages)
Authors: Elgammal, A.M.; Ismail, M.A.
Original format: PDF
Chinese Library Classification: Radio electronics, telecommunications technology

