Knowledge-Based Systems

Document-level multi-topic sentiment classification of Email data with BiLSTM and data augmentation


Abstract

Email data has unique characteristics, involving multiple topics, lengthy replies, formal language, high variance in length, high duplication, anomalies, and indirect relationships that distinguish it from other social media data. To better model Email documents and to capture complex sentiment structures in the content, we develop a framework for document-level multi-topic sentiment classification of Email data. Note that a large volume of labeled Email data is rarely publicly available. We introduce an optional data augmentation process to increase the size of datasets with synthetically labeled data to reduce the probability of overfitting and underfitting during the training process. To generate segments with topic embeddings and topic weighting vectors as inputs for our proposed model, we apply both latent Dirichlet allocation topic modeling and semantic text segmentation to post-process Email documents. Empirical results obtained with multiple sets of experiments, including performance comparison against various state-of-the-art algorithms with and without data augmentation and diverse parameter settings, are analyzed to demonstrate the effectiveness of our proposed framework. (C) 2020 Elsevier B.V. All rights reserved.
