A Character-level Convolutional Neural Network for Distinguishing Similar Languages and Dialects

Abstract

Discriminating between closely related language varieties is considered a challenging and important task. This paper describes our submission to the DSL 2016 shared task, which included two sub-tasks: one on discriminating similar languages and one on identifying Arabic dialects. We developed a character-level neural network for this task. Given a sequence of characters, our model embeds each character in vector space, runs the sequence through multiple convolutions with different filter widths, and pools the convolutional representations to obtain a hidden vector representation of the text that is used for predicting the language or dialect. We primarily focused on the Arabic dialect identification task and obtained an F1 score of 0.4834, ranking 6th out of 18 participants. We also analyze errors made by our system on the Arabic data in some detail, and point to challenges that such an approach faces.
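The pipeline described in the abstract (character embedding, convolutions of several widths, pooling, prediction) can be illustrated with a minimal forward-pass sketch in plain Python. This is not the authors' implementation. The class name `CharCNN`, the embedding size, the filter widths and counts, the ReLU nonlinearity, the choice of max-over-time pooling, and the softmax output layer are all assumptions made for illustration; the abstract only states that the convolutional representations are pooled. Weights are random stand-ins for trained parameters.

```python
import math
import random


def make_matrix(rows, cols, rng):
    """Small random weights; a stand-in for trained parameters."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class CharCNN:
    """Hypothetical character-level CNN classifier (forward pass only)."""

    def __init__(self, vocab, num_classes, emb_dim=8, widths=(2, 3, 4),
                 filters=4, seed=0):
        rng = random.Random(seed)
        # Index 0 is reserved for unknown characters and padding.
        self.char_to_id = {ch: i + 1 for i, ch in enumerate(vocab)}
        self.widths = widths
        self.emb = make_matrix(len(vocab) + 1, emb_dim, rng)
        # One bank of filters per width; each filter spans width * emb_dim inputs.
        self.conv = {w: make_matrix(filters, w * emb_dim, rng) for w in widths}
        self.out = make_matrix(num_classes, filters * len(widths), rng)

    def hidden(self, text):
        """Embed characters, convolve with each width, pool over positions."""
        ids = [self.char_to_id.get(ch, 0) for ch in text]
        # Pad short inputs so the widest filter fits at least once.
        ids += [0] * max(0, max(self.widths) - len(ids))
        vectors = [self.emb[i] for i in ids]
        pooled = []
        for w in self.widths:
            for filt in self.conv[w]:
                best = 0.0  # starting at 0 applies ReLU before the max (assumed)
                for start in range(len(vectors) - w + 1):
                    window = [x for vec in vectors[start:start + w] for x in vec]
                    act = sum(a * b for a, b in zip(filt, window))
                    best = max(best, act)  # max-over-time pooling (assumed)
                pooled.append(best)
        return pooled

    def predict_proba(self, text):
        """Linear layer plus softmax over the pooled hidden vector."""
        h = self.hidden(text)
        logits = [sum(a * b for a, b in zip(row, h)) for row in self.out]
        m = max(logits)
        exps = [math.exp(z - m) for z in logits]
        total = sum(exps)
        return [e / total for e in exps]


# Usage with an illustrative alphabet and an arbitrary number of classes.
model = CharCNN(vocab="abcdefghijklmnopqrstuvwxyz ", num_classes=5)
probs = model.predict_proba("salam alaykum")
```

Because each filter width contributes a fixed number of pooled values, the hidden vector has the same length regardless of input length, which is what lets a single output layer classify texts of varying size.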

Bibliographic record

Source: Workshop on NLP for similar languages, varieties and dialects, 2016, pp. 145-152 (8 pages)
Authors: Yonatan Belinkov; James Glass
Original format: PDF

