A Deep Learning Approach for the Romanized Tunisian Dialect Identification

Younes Jihene; Achour Hadhemi; Souissi Emna; Ferchichi Ahmed

Abstract

Language identification is an important task in natural language processing that consists of determining the language of a given text. It has increasingly attracted the interest of researchers over the past few years, especially for informal, code-switched textual content. This paper focuses on the identification of Romanized, user-generated Tunisian dialect on the social web. We segmented and annotated a corpus extracted from social media and propose a deep learning approach for the identification task. A Bidirectional Long Short-Term Memory neural network with Conditional Random Fields decoding (BLSTM-CRF) was used. For word embeddings, we use a combination of a word-character BLSTM vector representation and FastText embeddings, which take character n-gram features into consideration. The overall accuracy obtained is 98.65%.
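
As a rough illustration of the character n-gram features mentioned in the abstract, the sketch below splits a word into FastText-style subword units: the word is wrapped in boundary markers `<` and `>` and every substring of length `min_n` to `max_n` is collected. The function name `char_ngrams`, the 3 to 6 range, and the sample word are assumptions chosen for illustration; the abstract does not state the settings the authors used.

```python
def char_ngrams(word, min_n=3, max_n=6):
    """Return FastText-style character n-grams for a single word.

    The word is wrapped in '<' and '>' so that prefixes and suffixes
    produce n-grams distinct from word-internal ones.
    min_n and max_n are illustrative defaults, not values from the paper.
    """
    wrapped = f"<{word}>"
    grams = []
    for n in range(min_n, max_n + 1):
        # Slide a window of width n across the wrapped word.
        for i in range(len(wrapped) - n + 1):
            grams.append(wrapped[i:i + n])
    return grams


# Hypothetical Romanized token, used only as sample input.
print(char_ngrams("barcha", 3, 3))
# ['<ba', 'bar', 'arc', 'rch', 'cha', 'ha>']
```

Because spelling in Romanized dialect text is not standardized, two variants of the same word tend to share many of these subword units, which is one plausible reason subword-aware embeddings are attractive for this task.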

Bibliographic record

Source
The International Arab Journal of Information Technology, 2020, Issue 6, pp. 935-946 (12 pages)
Authors
Younes Jihene; Achour Hadhemi; Souissi Emna; Ferchichi Ahmed
Affiliations

Univ Tunis ISGT Tunis Tunisia;

Univ Tunis ISGT Teaching Comp Sci Tunis Tunisia;

Univ Tunis ENSIT Comp Sci Tunis Tunisia;

Univ Tunis ISGT Tunis Tunisia;

Original format: PDF
Language: English
Keywords
Tunisian dialect; language identification; deep learning; BLSTM; CRF; natural language processing;

Added to database: 2022-08-18 23:27:13
