A survey on the techniques, applications, and performance of short text semantic similarity

Han Mengting; Zhang Xuan; Yuan Xin; Jiang Jiahao; Yun Wei; Gao Chen

Abstract

Short text similarity plays an important role in natural language processing (NLP) and has been applied in many fields. Because short texts lack sufficient context, their similarity is difficult to measure. Using semantic similarity to compute textual similarity has attracted attention from academia and industry and has achieved better results. In this survey, we conduct a comprehensive and systematic analysis of semantic similarity. We first propose three categories of semantic similarity methods: corpus-based, knowledge-based, and deep learning (DL)-based. We analyze the pros and cons of representative and novel algorithms in each category, and we cover the applications of these similarity measures in other areas of NLP. We then evaluate state-of-the-art DL methods on four common datasets; the results show that DL-based methods better address the challenges of short text similarity, such as sparsity and complexity. In particular, the bidirectional encoder representations from transformers (BERT) model makes full use of the scarce information and the semantic information in short texts and obtains higher accuracy and F1 scores. Finally, we put forward some future directions.
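
The sparsity problem named in the abstract can be illustrated with a minimal sketch. It is not a method from the survey: it assumes plain whitespace tokenization and a purely lexical bag-of-words cosine, and the function name `bow_cosine` and the example sentences are hypothetical.

```python
import math
from collections import Counter


def bow_cosine(a: str, b: str) -> float:
    """Cosine similarity between bag-of-words count vectors of two texts.

    Assumption: lowercase whitespace tokenization, no stemming or stop words.
    """
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    # Only words present in both texts contribute to the dot product.
    dot = sum(va[w] * vb[w] for w in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty text has no direction to compare
    return dot / (norm_a * norm_b)


# Two paraphrases that share no words: lexical overlap gives a score of zero.
paraphrase_score = bow_cosine("how old are you", "what is your age")

# Same words, opposite meaning: lexical overlap gives a score close to one.
reordered_score = bow_cosine("the dog bit the man", "the man bit the dog")

print(paraphrase_score, reordered_score)
```

With only a handful of words per text, word overlap alone misjudges both pairs, which is the kind of gap that semantic similarity methods aim to close.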

Bibliographic record

Source
Concurrency and Computation: Practice and Experience | 2021, issue 5 | e5971.1-e5971.17 | 17 pages
Authors
Han Mengting; Zhang Xuan; Yuan Xin; Jiang Jiahao; Yun Wei; Gao Chen
Author affiliations

Yunnan Univ Sch Software Kunming 650091 Yunnan Peoples R China;

Yunnan Univ Sch Software Kunming 650091 Yunnan Peoples R China|Key Lab Software Engn Yunnan Prov Kunming Yunnan Peoples R China;

Yunnan Univ Sch Software Kunming 650091 Yunnan Peoples R China;

Yunnan Univ Sch Software Kunming 650091 Yunnan Peoples R China;

Yunnan Univ Sch Software Kunming 650091 Yunnan Peoples R China;

Yunnan Univ Sch Software Kunming 650091 Yunnan Peoples R China;

Original format: PDF
Language: English
Keywords
BERT; deep learning; semantic similarity; short text

Similar literature

1. Exploring the feasibility and accuracy of Latent Semantic Analysis based text mining techniques to detect similarity between patent documents and scientific publications [J]. Tom Magerman, Bart Van Looy, Xiaoyan Song. Scientometrics. 2010, issue 2
2. Measuring the short text similarity based on semantic and syntactic information [J]. Jiaqi Yang, Yongjun Li, Congjie Gao. Future Generation Computer Systems. 2021, issue Jana
3. A Comparison of Approaches for Measuring the Semantic Similarity of Short Texts Based on Word Embeddings [J]. Karlo Babić, Francesco Guerra, Sanda Martinčić-Ipšić. Journal of Information and Organizational Sciences. 2020, issue 2
4. Measuring Semantic Similarity in Short Texts through Greedy Pairing and Word Semantics [C]. Mihai Lintean, Vasile Rus. International Florida Artificial Intelligence Research Society Conference. 2012
5. Short-Text Semantic Similarity: Algorithms and Applications [D]. Sultan, Md Arafat. 2016
6. A Survey on Optimal Signal Processing Techniques Applied to Improve the Performance of Mechanical Sensors in Automotive Applications [O]. Wilmar Hernandez. 2007
7. Short Text Topic Modeling Techniques, Applications, and Performance: A Survey [O]. Jipeng Qiang, Zhenyu Qian, Yun Li. 2020
