Similarity Measures for Chinese Short Text Based on Representation Learning

Yan Li; Xucheng Yin; Yinghua Zhang; Hongwei Hao



Abstract

Measuring the similarity of Chinese short texts is an important prerequisite for many content-based text and document retrieval tasks. In this paper, we propose a fast method for representing Chinese short texts in order to calculate the similarity between them. The method builds on representations of Chinese words. First, Chinese word representations are learned by a deep neural network that combines local context embeddings with global context. Then, each word in a short text is replaced by its learned representation, and the short text is represented by a dynamic weighted-average function that depends on the target text. Next, cosine similarity is used to measure the similarity between texts. Finally, a visualization of the learned Chinese word representations shows that they capture semantics, and a similarity-measurement experiment demonstrates the effectiveness of our short text representation method.
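
The pipeline described in the abstract (replace words with learned vectors, combine them with a target-dependent weighted average, compare with cosine similarity) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not specify the weighting function, so `target_weights` below is a hypothetical stand-in, the 3-dimensional vectors in `EMB` are made-up toy values rather than learned representations, and the input is assumed to be already segmented into words.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def target_weights(words, target_words, emb):
    """Hypothetical target-dependent weights (an assumption, not the paper's
    function): each word is weighted by its highest non-negative cosine
    similarity to any word of the target text."""
    weights = []
    for w in words:
        best = max((cosine(emb[w], emb[t]) for t in target_words), default=0.0)
        weights.append(max(best, 0.0))
    return weights


def text_vector(words, weights, emb):
    """Weighted average of word vectors; uses uniform weights if all are zero."""
    total = sum(weights)
    if total == 0.0:
        weights = [1.0] * len(words)
        total = float(len(words))
    vec = [0.0] * len(emb[words[0]])
    for w, wt in zip(words, weights):
        for i, x in enumerate(emb[w]):
            vec[i] += wt * x
    return [x / total for x in vec]


def short_text_similarity(src_words, tgt_words, emb):
    """Cosine similarity between the weighted-average vectors of two texts."""
    # Words without a vector are skipped; an empty text scores 0.0.
    src = [w for w in src_words if w in emb]
    tgt = [w for w in tgt_words if w in emb]
    if not src or not tgt:
        return 0.0
    src_vec = text_vector(src, target_weights(src, tgt, emb), emb)
    tgt_vec = text_vector(tgt, target_weights(tgt, src, emb), emb)
    return cosine(src_vec, tgt_vec)


# Toy embeddings, invented for illustration only.
EMB = {
    "手机": [0.9, 0.1, 0.0],
    "电话": [0.8, 0.2, 0.1],
    "价格": [0.1, 0.9, 0.0],
    "天气": [0.0, 0.1, 0.9],
}

sim_close = short_text_similarity(["手机", "价格"], ["电话", "价格"], EMB)
sim_far = short_text_similarity(["手机", "价格"], ["天气"], EMB)
print(sim_close, sim_far)
```

With these toy vectors, the pair that shares 价格 and has the similar vectors for 手机 and 电话 scores higher than the pair involving 天气. In practice `EMB` would be replaced by vectors from a trained word-representation model.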


Bibliographic information

Source
Journal of Information and Computational Science | 2015, Issue 6 | pp. 2253-2263 | 11 pages
Authors
Yan Li; Xucheng Yin; Yinghua Zhang; Hongwei Hao
Author affiliations

University of Science and Technology Beijing, Beijing 100083, China
University of Science and Technology Beijing, Beijing 100083, China
Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China

Indexing information
Original format: PDF
Language: English
Chinese Library Classification
Keywords
Short Text; Representation Learning; Similarity Measure

Similar literature

1. Measuring the short text similarity based on semantic and syntactic information [J]. Jiaqi Yang, Yongjun Li, Congjie Gao. Future Generation Computer Systems. 2021, Jan. issue
2. A Comparison of Approaches for Measuring the Semantic Similarity of Short Texts Based on Word Embeddings [J]. Karlo Babić, Francesco Guerra, Sanda Martinčić-Ipšić. Journal of Information and Organizational Sciences. 2020, Issue 2
3. SyMSS: A syntax-based measure for short-text semantic similarity [J]. Jesus Oliva, Jose Ignacio Serrano, Maria Dolores del Castillo. Data & Knowledge Engineering. 2011, Issue 4
4. Similarity Measure for Polish Short Texts Based on Wordnet-Enhanced Bag-of-words Representation [C]. Maciej Piasecki, Anna Gut. Language and Technology Conference. 2018
5. Global Self-Similarity and Saliency Measures Based on Sparse Representations for Classification of Objects and Spatio-temporal Sequences [D]. Somasundaram, Guruprasad. 2012
6. A Method of Short Text Representation Based on the Feature Probability Embedded Vector [O]. Wanting Zhou, Hanbin Wang, Hongguang Sun. 2019
7. FUSE (Fuzzy Similarity Measure) - A measure for determining fuzzy short text similarity using Interval Type-2 fuzzy sets [O]. Naeemeh Adel, Keeley Crockett, Alan Crispin. 2018
