Arabic Dialect Identification for Travel and Twitter Text

Abstract

This paper presents the results of experiments conducted as part of the MADAR Shared Task at WANLP 2019 on Arabic fine-grained dialect identification. Dialect identification is a prominent task in natural language processing because its output can be used to improve subsequent language-processing modules. We explored features such as character and word n-grams and language model probabilities across different classifiers. The results show that these features help improve dialect classification accuracy. They also show that traditional machine learning classifiers tend to outperform neural network models on this task in a low-resource setting.
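To make the feature types named in the abstract concrete, the following is a minimal sketch of character and word n-gram extraction feeding a traditional classifier. It is not the authors' system: the multinomial Naive Bayes model, the add-one smoothing, the n-gram ranges, the helper names (`char_ngrams`, `word_ngrams`, `features`, `NaiveBayes`), and the made-up training strings with placeholder labels `dialect_A` and `dialect_B` are all assumptions chosen for illustration. The per-dialect log scores are only loosely analogous to the language model probability features the abstract mentions.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(text, n_min=1, n_max=3):
    """Character n-grams of length n_min..n_max over the space-padded text."""
    padded = f" {text.strip()} "
    return [padded[i:i + n]
            for n in range(n_min, n_max + 1)
            for i in range(len(padded) - n + 1)]


def word_ngrams(text, n_min=1, n_max=2):
    """Word n-grams of length n_min..n_max over whitespace tokens."""
    tokens = text.split()
    return [" ".join(tokens[i:i + n])
            for n in range(n_min, n_max + 1)
            for i in range(len(tokens) - n + 1)]


def features(text):
    """Combine both feature types; prefixes keep the two spaces separate."""
    return (["c:" + g for g in char_ngrams(text)]
            + ["w:" + g for g in word_ngrams(text)])


class NaiveBayes:
    """Multinomial Naive Bayes with additive smoothing (illustrative only)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, texts, labels):
        self.counts = defaultdict(Counter)   # label -> feature counts
        self.label_counts = Counter(labels)  # label -> number of sentences
        for text, label in zip(texts, labels):
            self.counts[label].update(features(text))
        self.vocab = {f for c in self.counts.values() for f in c}
        self.totals = {lab: sum(c.values()) for lab, c in self.counts.items()}
        return self

    def log_scores(self, text):
        """Log prior plus smoothed log likelihood of the text per label."""
        n_docs = sum(self.label_counts.values())
        scores = {}
        for label, n_label in self.label_counts.items():
            score = math.log(n_label / n_docs)
            denom = self.totals[label] + self.alpha * len(self.vocab)
            for f in features(text):
                if f in self.vocab:  # features unseen in training are skipped
                    score += math.log((self.counts[label][f] + self.alpha) / denom)
            scores[label] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Made-up strings with placeholder labels; not real dialect data.
train_texts = ["zaba lomi zaba", "lomi zaba teku",
               "quro vinu quro", "vinu quro peda"]
train_labels = ["dialect_A", "dialect_A", "dialect_B", "dialect_B"]

model = NaiveBayes().fit(train_texts, train_labels)
prediction = model.predict("zaba teku")
```

Character n-grams are a common choice for dialect identification because they capture spelling and morphological cues even when whole words are unseen, which may matter in the low-resource setting the abstract describes.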
