A New Vector Representation of Short Texts for Classification

Li Yangyang; Liu Bo


Abstract

The brevity and sparsity of short texts, together with synonyms and homonyms, are the main obstacles to short-text classification. In recent years, research on short-text classification has focused on expanding short texts, but it has rarely guaranteed the validity of the expanded words. This study proposes a new method that weakens these effects without external knowledge. The proposed method analyses short texts with a topic model based on Latent Dirichlet Allocation (LDA), represents each short text with a vector space model, and presents a new way to adjust the vectors of the short texts. In the experiments, two open short-text data sets, composed of Google News items and web search snippets, are utilised to evaluate classification performance and to demonstrate the effectiveness of the method.
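The abstract does not say how the vectors are adjusted, so the following is only a minimal sketch of the general pipeline it outlines (LDA topics, then a vector space representation, then an adjustment), not the authors' method. The functions `fit_lda` and `adjusted_vector`, the mixing weight `lam`, the hyperparameters, and the toy corpus are all assumptions made for illustration. The adjustment shown here simply adds topic-derived probability mass to each term-frequency vector, so that words a short text does not contain still receive a small weight.

```python
import random
from collections import Counter


def fit_lda(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling; return (vocab, theta, phi)."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    index = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)
    doc_topic = [[0] * n_topics for _ in docs]            # tokens of doc d in topic k
    topic_word = [[0] * n_words for _ in range(n_topics)]  # count of word w in topic k
    topic_total = [0] * n_topics                           # tokens assigned to topic k
    assign = []                                            # topic of every token

    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        row = []
        for w in doc:
            k = rng.randrange(n_topics)
            row.append(k)
            doc_topic[d][k] += 1
            topic_word[k][index[w]] += 1
            topic_total[k] += 1
        assign.append(row)

    # Resample each token's topic from its conditional distribution.
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k, wi = assign[d][i], index[w]
                doc_topic[d][k] -= 1
                topic_word[k][wi] -= 1
                topic_total[k] -= 1
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][wi] + beta)
                    / (topic_total[t] + n_words * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][wi] += 1
                topic_total[k] += 1

    # Smoothed estimates: theta = topics per document, phi = words per topic.
    theta = [
        [(doc_topic[d][t] + alpha) / (len(doc) + n_topics * alpha)
         for t in range(n_topics)]
        for d, doc in enumerate(docs)
    ]
    phi = [
        [(topic_word[t][wi] + beta) / (topic_total[t] + n_words * beta)
         for wi in range(n_words)]
        for t in range(n_topics)
    ]
    return vocab, theta, phi


def adjusted_vector(doc, vocab, theta_d, phi, lam=0.5):
    """Hypothetical adjustment: term frequency plus lam * topic-implied word mass."""
    tf = Counter(doc)
    length = max(len(doc), 1)
    vec = []
    for wi, w in enumerate(vocab):
        topic_mass = sum(theta_d[t] * phi[t][wi] for t in range(len(phi)))
        vec.append(tf[w] / length + lam * topic_mass)
    return vec


# Toy corpus (invented for this example): two finance and two sports snippets.
docs = [
    "stock market shares fall".split(),
    "market investors buy shares".split(),
    "football team wins match".split(),
    "team loses football final".split(),
]
vocab, theta, phi = fit_lda(docs, n_topics=2)
vectors = [adjusted_vector(doc, vocab, theta[i], phi) for i, doc in enumerate(docs)]
```

Each resulting vector has one entry per vocabulary word and no zero entries, which is one simple way to reduce sparsity without external knowledge. The vectors could then be passed to any standard classifier; that step is omitted here.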


Bibliographic record

Source: The International Arab Journal of Information Technology, 2020, No. 2, pp. 241-249 (9 pages)
Authors: Li Yangyang; Liu Bo
Author affiliations: Jinan Univ Coll Informat Sci & Technol Jinan Peoples R China (listed for both authors)
Original format: PDF
Language: English
Keywords: Text representation; short-text classification; Latent Dirichlet Allocation; topic model
Date added to database: 2022-08-18 23:33:21

Similar literature

1. Improving short text classification by learning vector representations of both words and hidden topics [J]. Zhang Heng, Zhong Guoqiang. Knowledge-Based Systems, 2016, issue Jun. 15.
2. Joint Representations of Texts and Labels with Compositional Loss for Short Text Classification [J]. Hao Ming, Wang Weijing, Zhou Fang. Journal of Web Engineering, 2021, No. 3.
3. Text Classification using Bi-Gram Alphabet Document Vector Representation [J]. Fatma Elghannam. International Journal of Computer Trends and Technology, 2018, No. 2.
4. A Semi-Supervised Short Text Classification Method Based on Weighted Word Vector Representation [C]. Zhiming Zhang, Jie Luo, Geyu Huang. 2019 IEEE 9th International Conference on Electronics Information and Emergency Communication, 2019.
5. Improving Concept Representations for Short Text Classification [D]. Tao Sijie. 2020.
6. A Method of Short Text Representation Based on the Feature Probability Embedded Vector [O]. Wanting Zhou, Hanbin Wang, Hongguang Sun. 2019.
7. Polyseme-Aware Vector Representation for Text Classification [O]. Shun Guo, Nianmin Yao. 2020.
