Semantic Text Encoding for Text Classification Using Convolutional Neural Networks

Abstract

In this paper, we encode the semantics of a text document in an image to take advantage of the same Convolutional Neural Networks (CNNs) that have been successfully employed for image classification. We use Word2Vec, an estimation of word representations in a vector space that can maintain the semantic and syntactic relationships among words. Word2Vec vectors are transformed into graphical words representing the sequence of words in the text document. The encoded images are classified using the AlexNet architecture. We introduce a new dataset named Text-Ferramenta, gathered from an Italian price comparison website, and we evaluate the encoding scheme on this dataset along with two publicly available datasets, namely 20news-bydate and StackOverflow. Our scheme outperforms the text classification approach based on Doc2Vec and Support Vector Machine (SVM) when all the words of a text document can be completely encoded in an image. We believe that the results on these datasets are an interesting starting point for many Natural Language Processing works based on CNNs, such as a multimodal approach that could use a single CNN to classify both image and text information.
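
The abstract does not give the layout details of the encoding, so the following is only a rough sketch of the general idea of turning word vectors into "graphical words" placed in reading order on an image. The function name `encode_text_as_image`, the 64x64 image size, the 4x4 RGB tile per word, the global min-max scaling, and the random vectors standing in for real Word2Vec output are all assumptions made for illustration, not the authors' scheme.

```python
import numpy as np


def encode_text_as_image(word_vectors, image_size=64, tile=4):
    """Place one RGB tile per word, left to right and top to bottom.

    word_vectors: array of shape (n_words, tile * tile * 3).
    Returns (image, n_encoded); image is uint8 with shape
    (image_size, image_size, 3). Sizes are illustrative assumptions.
    """
    vectors = np.asarray(word_vectors, dtype=np.float64)
    if vectors.ndim != 2 or vectors.shape[1] != tile * tile * 3:
        raise ValueError("each word vector needs tile * tile * 3 components")

    tiles_per_row = image_size // tile
    capacity = tiles_per_row * tiles_per_row
    # Words beyond the image capacity are dropped (not fully encoded).
    n_encoded = min(len(vectors), capacity)

    image = np.zeros((image_size, image_size, 3), dtype=np.uint8)
    if n_encoded == 0:
        return image, 0

    # Assumed normalization: global min-max scaling to the 0..255 range.
    lo, hi = vectors.min(), vectors.max()
    scale = 255.0 / (hi - lo) if hi > lo else 0.0
    pixels = np.round((vectors[:n_encoded] - lo) * scale).astype(np.uint8)

    for i in range(n_encoded):
        row, col = divmod(i, tiles_per_row)
        y, x = row * tile, col * tile
        image[y:y + tile, x:x + tile] = pixels[i].reshape(tile, tile, 3)
    return image, n_encoded


# Random vectors stand in for Word2Vec embeddings of a 10-word document.
rng = np.random.default_rng(0)
fake_vectors = rng.normal(size=(10, 48))
image, n_encoded = encode_text_as_image(fake_vectors)
```

In this sketch a document longer than the image capacity is truncated, which mirrors the abstract's caveat that the scheme performs best when all the words of a document fit in the image. The resulting array could then be passed to any image classifier.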

Bibliographic details

Source
IAPR International Conference on Document Analysis and Recognition | 2017 | pp. 16-21 | 6 pages
Authors
Ignazio Gallo; Shah Nawaz; Alessandro Calefati
Original format: PDF
Keywords
Encoding; Visualization; Semantics; Kernel; Image coding; Support vector machines; Convolutional neural networks

Similar literature

1. Task-generic semantic convolutional neural network for web text-aided image classification [J]. Wang Dongzhe, Mao Kezhi. Neurocomputing. 2019, issue FEBa15
2. Semantic expansion using word embedding clustering and convolutional neural network for improving short text classification [J]. Wang Peng, Xu Bo, Xu Jiaming, et al. Neurocomputing. 2016, issue JANa22PTaB
3. Comparative effectiveness of convolutional neural network (CNN) and recurrent neural network (RNN) architectures for radiology text report classification [J]. Banerjee Imon, Ling Yuan, Chen Matthew C., et al. Artificial Intelligence in Medicine. 2019, issue JUNa
4. Semantic Text Encoding for Text Classification Using Convolutional Neural Networks [C]. Ignazio Gallo, Shah Nawaz, Alessandro Calefati. IAPR International Conference on Document Analysis and Recognition. 2017
5. Deep Neural Language Model for Text Classification Based on Convolutional and Recurrent Neural Networks [D]. Hassan, Abdalraouf. 2018
6. Clinical text classification with rule-based features and knowledge-guided convolutional neural networks [O]. Liang Yao, Chengsheng Mao, Yuan Luo. 2019
7. Comparative effectiveness of convolutional neural network (CNN) and recurrent neural network (RNN) architectures for radiology text report classification [O]. Imon Banerjee, Yuan Ling, Matthew C. Chen, et al. 2019
