Journal: Natural Language Engineering

Syntax-ignorant N-gram embeddings for dialectal Arabic sentiment analysis

Abstract

Arabic sentiment analysis models have recently employed compositional paragraph or sentence embedding features to represent informal Arabic dialectal content. These embeddings are mostly composed via ordered, syntax-aware composition functions and learned within deep neural network architectures. Because syntactic structure and word order differ among the Arabic dialects, a sentiment analysis system developed for one dialect might not be efficient for the others. Here we present syntax-ignorant, sentiment-specific n-gram embeddings for sentiment analysis of several Arabic dialects. The novelty of the proposed model lies in its features and architecture: sentiment is expressed by embeddings that are composed via an unordered additive composition function and learned within a shallow neural architecture. To evaluate the generated embeddings, we compared them with state-of-the-art word/paragraph embeddings. First, we investigated their efficiency as expressive sentiment features, based on visualisation maps constructed for our n-gram embeddings and for word2vec/doc2vec. Second, using several Eastern/Western Arabic datasets of single-dialect and multi-dialect content, we compared the ability of our embeddings to recognise sentiment against models based on word/paragraph embeddings. This comparison was performed within both shallow and deep neural network architectures and with two unordered composition functions. The results revealed that the introduced syntax-ignorant embeddings could represent single dialects and combinations of different dialects efficiently: our shallow sentiment analysis model, trained with the proposed n-gram embeddings, could outperform the word2vec/doc2vec models and rival deep neural architectures while consuming remarkably less training time.
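
As a rough illustration of what an unordered additive composition function does, the following Python sketch sums the vectors of the tokens in an n-gram, so that any reordering of the same tokens yields the same embedding. This is not the authors' implementation: the function names `ngrams` and `additive_compose`, the English placeholder tokens, and the toy vector values are all assumptions made for this example, and the abstract does not say how the vectors are learned or how unknown tokens are handled.

```python
from typing import Dict, List, Sequence, Tuple

Vector = List[float]

# Toy 3-dimensional token vectors. The values are invented for illustration
# and chosen as exact binary fractions so that the sums below are exact.
TOY_EMBEDDINGS: Dict[str, Vector] = {
    "service": [0.25, -0.5, 0.5],
    "very": [0.0, 0.25, 0.125],
    "good": [0.5, 0.5, -0.25],
}


def ngrams(tokens: Sequence[str], n: int) -> List[Tuple[str, ...]]:
    """Return all contiguous n-grams of the token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def additive_compose(tokens: Sequence[str],
                     table: Dict[str, Vector],
                     dim: int = 3) -> Vector:
    """Sum the vectors of the given tokens, element by element.

    Addition is commutative, so the result does not depend on token order.
    Tokens missing from the table are skipped (an assumption of this sketch).
    """
    total = [0.0] * dim
    for token in tokens:
        vector = table.get(token)
        if vector is None:
            continue
        for i, value in enumerate(vector):
            total[i] += value
    return total


# The same three tokens in two different orders.
original_order = additive_compose(["service", "very", "good"], TOY_EMBEDDINGS)
shuffled_order = additive_compose(["good", "service", "very"], TOY_EMBEDDINGS)
```

With these toy values, `original_order` and `shuffled_order` are the same vector, which is the property that makes such a composition insensitive to word-order differences between dialects. With arbitrary floating-point values the two sums can differ by rounding error, so a tolerance-based comparison such as `math.isclose` would be the safer check.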