Short Text Clustering via Convolutional Neural Networks

Abstract

Short text clustering has become an increasingly important task with the popularity of social media, and it is a challenging problem because short texts yield sparse representations. In this paper, we propose Short Text Clustering via Convolutional neural networks (abbreviated STCC), which makes the learned features more suitable for clustering by imposing one constraint on them through a self-taught learning framework, without using any external tags/labels. First, we embed the original keyword features into compact binary codes with a locality-preserving constraint. Then, word embeddings are explored and fed into convolutional neural networks to learn deep feature representations, with the output units fitting the pre-trained binary codes during training. After obtaining the learned representations, we use K-means to cluster them. Our extensive experimental study on two public short text datasets shows that the deep feature representation learned by our approach achieves significantly better clustering performance than some existing features, such as term frequency-inverse document frequency, Laplacian eigenvectors and average embedding.
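
As a rough illustration of the final clustering step, the sketch below runs K-means over small feature vectors. It is not the authors' implementation: the CNN and binary-code stages are omitted, the features come from the average-embedding baseline named in the abstract, and the toy word vectors, the function names, and the farthest-first initialisation are all assumptions chosen to keep the example small and deterministic.

```python
def average_embedding(tokens, word_vectors):
    """Average the vectors of the known tokens (the 'average embedding' baseline)."""
    known = [word_vectors[t] for t in tokens if t in word_vectors]
    if not known:
        raise ValueError("no token has a vector")
    return [sum(col) / len(known) for col in zip(*known)]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_first(points, k):
    """Deterministic initialisation (an assumption, not taken from the paper)."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        # Pick the point whose nearest chosen centroid is farthest away.
        nxt = max(points, key=lambda p: min(squared_distance(p, c) for c in centroids))
        centroids.append(list(nxt))
    return centroids


def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm; returns one cluster index per point."""
    centroids = farthest_first(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical 2-dimensional word vectors, for illustration only.
word_vectors = {
    "goal": [1.0, 0.0], "match": [0.9, 0.1], "team": [0.8, 0.2],
    "stock": [0.0, 1.0], "market": [0.1, 0.9], "price": [0.2, 0.8],
}
texts = [["goal", "match"], ["team", "goal"], ["stock", "market"], ["price", "stock"]]

features = [average_embedding(t, word_vectors) for t in texts]
labels = kmeans(features, k=2)
print(labels)  # [0, 0, 1, 1]
```

In the approach described above, the vectors passed to `kmeans` would be the representations learned by the convolutional network, not averaged word vectors.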

Bibliographic details

Source
Workshop on Vector Space Modeling for Natural Language Processing | 2015 | 8 pages
Authors
Jiaming Xu; Peng Wang; Guanhua Tian; Bo Xu; Jun Zhao; Fangyuan Wang; Hongwei Hao
Original format: PDF
Chinese Library Classification: Linear space theory (vector spaces)

Similar literature

1. Self-Taught convolutional neural networks for short text clustering [J]. Xu Jiaming, Wang Peng, Zheng Suncong. Neural Networks: The Official Journal of the International Neural Network Society. 2017
2. Semantic expansion using word embedding clustering and convolutional neural network for improving short text classification [J]. Wang Peng, Xu Bo, Xu Jiaming. Neurocomputing. 2016, issue Jan. 22, Pt. B
3. Incorporating context-relevant concepts into convolutional neural networks for short text classification [J]. Neurocomputing. 2020, issue Apr. 21
4. Research on Chinese Short Text Clustering Ensemble via Convolutional Neural Networks [C]. Haowen Wan, Bo Ning, Xiaoyu Tao. International conference on artificial intelligence in China. 2020
5. Deep Neural Language Model for Text Classification Based on Convolutional and Recurrent Neural Networks [D]. Hassan, Abdalraouf. 2018
6. Automatic extraction of cancer registry reportable information from free-text pathology reports using multitask convolutional neural networks [O]. Mohammed Alawad, Shang Gao, John X Qiu. 2020
7. Self-Taught Convolutional Neural Networks for Short Text Clustering [O]. Xu, Jiaming; Xu, Bo; Wang, Peng. 2016
