A Novel Document Classification Algorithm Based on Statistical Features and Attention Mechanism


Abstract

Bidirectional Long Short-Term Memory (Bi-LSTM) networks are often used in deep learning to address long-term dependencies and exploding gradients. Combining the forward and backward passes over a sequence also captures more semantic information. However, such models do not explicitly account for the influence of keywords in a document. Attention mechanisms have been used successfully in several state-of-the-art natural language processing applications. In text processing, attention weights are typically computed at the word level. Although these methods improve model performance, they increase the computational cost significantly. In this paper, we propose to calculate attention weights at the structured event level, since 1) events contain richer semantics than words or phrases; and 2) an event-based attention mechanism reduces computational cost. Unlike existing deep learning models, which do not rely on statistical features of the text, we add statistical features to the attention weight calculation. Compared with existing models, the semantic information contained in the event structure and the corresponding statistical features improve the quality of the text vector representation and yield better classification performance. Finally, we evaluate our model in terms of accuracy, recall and F-score. The experimental results show that our model achieves better results while reducing computational cost.
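The abstract does not give the exact formulas, so the following is only a minimal sketch of the general idea of attention over event vectors with a statistical feature mixed into the score. Everything named here is an assumption for illustration: the function `event_attention`, the dot-product scoring against a `context` vector, the additive combination controlled by `stat_weight`, and the toy numbers. In the paper the event vectors would presumably come from a Bi-LSTM encoder, and the statistical feature could be something like a TF-IDF-style score, but neither is confirmed by the abstract.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def event_attention(event_vectors, context, stat_features, stat_weight=1.0):
    """Attend over event vectors, biasing each score by a statistical feature.

    Hypothetical scoring rule (an assumption, not the paper's formula):
        score_i = dot(event_i, context) + stat_weight * stat_i
    Returns the attention weights and the weighted-sum document vector.
    """
    scores = [
        sum(h * c for h, c in zip(vec, context)) + stat_weight * stat
        for vec, stat in zip(event_vectors, stat_features)
    ]
    weights = softmax(scores)
    dim = len(context)
    doc_vector = [
        sum(w * vec[d] for w, vec in zip(weights, event_vectors))
        for d in range(dim)
    ]
    return weights, doc_vector


# Toy input: three event vectors of dimension 2 (made-up values).
events = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
context = [0.5, 0.5]        # hypothetical learned context vector
stats = [0.1, 0.1, 0.9]     # hypothetical per-event statistical scores

weights, doc_vector = event_attention(events, context, stats)
print(weights)
print(doc_vector)
```

Because attention runs over a handful of events rather than every word, the softmax and weighted sum touch far fewer items per document, which is the cost argument the abstract makes. In this toy input the third event receives the largest weight, since it has both the highest dot product with the context vector and the highest statistical score.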


Bibliographic record

Source
International Joint Conference on Neural Networks | 2018 | pp. 1-6 | 6 pages
Authors
Chao Li; Yanfen Cheng; Hongxia Wang
Original format PDF
Keywords
Semantics; Feature extraction; Machine learning; Classification algorithms; Computational modeling; Eigenvalues and eigenfunctions; Data mining

