Journal: Procedia Computer Science

Developing Resources For Sentiment Analysis Of Informal Arabic Text In Social Media

Abstract

Natural Language Processing (NLP) applications such as text categorization, machine translation, and sentiment analysis need annotated corpora and lexicons to check quality and performance. This paper describes the development of resources for sentiment analysis specifically for Arabic text in social media. A distinctive feature of the corpora and lexicons developed is that they are determined from informal Arabic that does not conform to grammatical or spelling standards. We refer to Arabic social media content of this sort as Dialectal Arabic (DA): informal Arabic originating from, and potentially mixing, a range of different individual dialects. The paper describes the process adopted for developing corpora and sentiment lexicons for sentiment analysis within different social media, and the resulting characteristics of those resources. In addition to providing useful NLP data sets for Dialectal Arabic, the work also contributes to understanding the approach to developing corpora and lexicons.
