Layered Representation of Bengali Texts in Reduced Dimension Using Deep Feedforward Neural Network for Categorization

Abstract

Automatic text categorization is a primary step in information retrieval, where the most relevant documents must be found within an enormous collection. It is also useful across a wide range of web applications, from portal sites and news indexing to spam filtering and genre tagging. A significant amount of research has been carried out in this field, most of it dominated by Support Vector Machine (SVM) models. Although these models have been very successful, they require careful feature engineering to achieve optimum results. In this paper, we propose a model for Bengali text categorization that does not require feature engineering and is able to capture nonlinearity in the data. We first find a lower-dimensional representation of each document's tf-idf vector using denoising autoencoders, and then feed this transformed vector into a deep feedforward network to find its most plausible category. We also show empirically that our model achieves 94.05% accuracy on 12 categories, surpassing the best existing models for Bengali text categorization.
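
The pipeline described in the abstract (tf-idf vectors, a denoising autoencoder for dimensionality reduction, then a feedforward classifier) can be sketched roughly as follows. This is a minimal NumPy illustration, not the authors' implementation: the abstract does not give layer sizes, noise type, activations, or training settings, so the masking noise, the single hidden layer in each network, the learning rates, and the toy English corpus are all assumptions, and the helper names (`tfidf`, `train_denoising_autoencoder`, `train_classifier`) are hypothetical.

```python
import numpy as np

def tfidf(docs):
    """Return an L2-normalised tf-idf matrix (documents x vocabulary)."""
    vocab = sorted({w for d in docs for w in d.split()})
    index = {w: i for i, w in enumerate(vocab)}
    tf = np.zeros((len(docs), len(vocab)))
    for r, d in enumerate(docs):
        for w in d.split():
            tf[r, index[w]] += 1.0
    df = (tf > 0).sum(axis=0)
    idf = np.log((1 + len(docs)) / (1 + df)) + 1.0  # smoothed idf (assumed variant)
    x = tf * idf
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def train_denoising_autoencoder(x, hidden, rng, epochs=500, lr=0.5, drop=0.3):
    """Learn an encoder by reconstructing the clean input from a masked copy."""
    n, d = x.shape
    w1 = rng.normal(0, 0.1, (d, hidden)); b1 = np.zeros(hidden)
    w2 = rng.normal(0, 0.1, (hidden, d)); b2 = np.zeros(d)
    for _ in range(epochs):
        noisy = x * (rng.random(x.shape) > drop)  # masking noise (assumption)
        h = sigmoid(noisy @ w1 + b1)
        out = h @ w2 + b2                         # linear decoder
        g_out = (out - x) / n                     # gradient of squared error
        g_h = (g_out @ w2.T) * h * (1 - h)
        w2 -= lr * h.T @ g_out; b2 -= lr * g_out.sum(axis=0)
        w1 -= lr * noisy.T @ g_h; b1 -= lr * g_h.sum(axis=0)
    return lambda v: sigmoid(v @ w1 + b1)         # encoder only

def train_classifier(z, y, classes, hidden, rng, epochs=2000, lr=0.5):
    """Feedforward network (one tanh layer, softmax output) on encoded vectors."""
    n, d = z.shape
    w1 = rng.normal(0, 0.5, (d, hidden)); b1 = np.zeros(hidden)
    w2 = rng.normal(0, 0.5, (hidden, classes)); b2 = np.zeros(classes)
    onehot = np.eye(classes)[y]
    for _ in range(epochs):
        h = np.tanh(z @ w1 + b1)
        logits = h @ w2 + b2
        p = np.exp(logits - logits.max(axis=1, keepdims=True))
        p /= p.sum(axis=1, keepdims=True)
        g = (p - onehot) / n                      # gradient of cross-entropy
        g_h = (g @ w2.T) * (1 - h ** 2)
        w2 -= lr * h.T @ g; b2 -= lr * g.sum(axis=0)
        w1 -= lr * z.T @ g_h; b1 -= lr * g_h.sum(axis=0)
    return lambda v: np.argmax(np.tanh(v @ w1 + b1) @ w2 + b2, axis=1)

# Toy two-category corpus standing in for the Bengali documents (assumption).
docs = ["goal match team win", "team player goal score", "match player win score",
        "vote election party minister", "party minister vote law", "election law party vote"]
y = np.array([0, 0, 0, 1, 1, 1])

rng = np.random.default_rng(0)
x = tfidf(docs)                                   # high-dimensional tf-idf vectors
encode = train_denoising_autoencoder(x, hidden=3, rng=rng)
z = encode(x)                                     # reduced-dimension representation
predict = train_classifier(z, y, classes=2, hidden=8, rng=rng)
pred = predict(z)
print(pred)
```

The classifier only ever sees the encoded vectors `z`, which mirrors the two-stage design in the abstract. Results on a six-document toy corpus say nothing about the 94.05% figure reported for the actual 12-category dataset.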

Bibliographic details

Source
International Conference of Computer and Information Technology | 2018 | 422 p. | 5 pages
Authors
Bishwajit Purkaystha; Tapos Datta; Md. Saiful Islam; Marium-E-Jannat
Original format
PDF
Chinese Library Classification
Computing technology, computer technology
Keywords
Biological system modeling; Noise reduction; Analytical models; Mathematical model; Text categorization; Support vector machines; Feature extraction

Similar literature

1. A Novel Text Representation Model to Categorize Text Documents using Convolution Neural Network [J]. M. B. Revanasiddappa, B. S. Harish. International Journal of Intelligent Systems and Applications, 2019, issue 5.
2. A new deep neural network based on a stack of single-hidden-layer feedforward neural networks with randomly fixed hidden neurons [J]. Hu Junying, Zhang Jiangshe, Zhang Chunxia. Neurocomputing, 2016, issue JANa1.
3. Incremental text categorization based on hybrid optimization-based deep belief neural network [J]. Srilakshmi V, Anuradha K., Bindu C. Shoba. Journal of High Speed Networks, 2021, issue 2.
4. Layered Representation of Bengali Texts in Reduced Dimension Using Deep Feedforward Neural Network for Categorization [C]. Bishwajit Purkaystha, Tapos Datta, Md. Saiful Islam. International Conference of Computer and Information Technology, 2018.
5. Neural network control and an optoelectronic implementation of a multilayer feedforward neural network [D]. Yamamura, Alan Akihiro. 1992.
6. Correspondence between Monkey Visual Cortices and Layers of a Saliency Map Model Based on a Deep Convolutional Neural Network for Representations of Natural Images [O]. Nobuhiko Wagatsuma, Akinori Hidaka, Hiroshi Tamura. 2021.
7. Distinction of The Authors of Texts Using Multilayered Feedforward Neural Networks [O]. Suvad Selman. 2012.
8. Multi-Layered Feedforward Neural Networks for Image Segmentation [R]. Tarr, G. L. 1991.
