Unicode Aided Language Identification across Multiple Scripts and Heterogeneous Data

Farheen Hanif; Fouzia Latif; M. Sikandar Hayat Khiyal


Abstract

With the explosive growth of multilingual data on the Internet and in other information and communication fields, the need for effective automated language identifiers has increased. As more information finds its way into computer systems and the web, categorizing it manually is becoming increasingly infeasible. In this study we discuss improvements we have achieved over existing language identification methods. A couple of new areas that were not explored before are the inclusion of non-Roman scripts and the active use of Unicode information about scripts to enhance the language detection process.
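The abstract does not describe how the script information is used, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that the first word of a character's Unicode name (for example `ARABIC` in `ARABIC LETTER SEEN`) is an acceptable stand-in for its script, since Python's standard library does not expose the Unicode Script property directly. The `CANDIDATES` table and all function names are hypothetical and chosen for illustration.

```python
import unicodedata
from collections import Counter

# Hypothetical, deliberately tiny mapping from script to candidate languages.
# A real identifier would cover far more scripts and languages.
CANDIDATES = {
    "LATIN": ["en", "fr", "de"],
    "ARABIC": ["ar", "ur", "fa"],
    "CYRILLIC": ["ru", "uk"],
    "DEVANAGARI": ["hi", "mr"],
}


def script_of(ch):
    """Approximate a character's script by the first word of its Unicode name.

    This is a heuristic (an assumption of this sketch), not the official
    Unicode Script property. Non-alphabetic characters return None.
    """
    if not ch.isalpha():
        return None
    name = unicodedata.name(ch, "")
    return name.split(" ")[0] if name else None


def dominant_script(text):
    """Return the most frequent script label among the letters of text."""
    counts = Counter(s for s in map(script_of, text) if s)
    return counts.most_common(1)[0][0] if counts else None


def candidate_languages(text):
    """Narrow the language search space using the dominant script."""
    return CANDIDATES.get(dominant_script(text), [])


if __name__ == "__main__":
    for sample in ["Hello world", "привет", "سلام"]:
        print(sample, dominant_script(sample), candidate_languages(sample))
```

In a design like this, the script check would only narrow the candidate set; a statistical method (for example character n-gram profiles) would still be needed to choose among languages that share a script.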


Bibliographic details

Source: Information Technology Journal, 2007, Issue 4, 7 pages
Authors: Farheen Hanif; Fouzia Latif; M. Sikandar Hayat Khiyal
Original format: PDF
Chinese Library Classification: Automation technology, computer technology

