Boosting Explicit Semantic Analysis by Clustering Paragraph Vectors of Wikipedia Articles

Abstract

Explicit Semantic Analysis (ESA) is an effective method that uses Wikipedia entries (articles) to represent text and to compute semantic relatedness (SR) for text pairs. Like ordinary web search techniques, ESA suffers from redundancy as the number of Wikipedia entries keeps growing. Entry redundancy can lead to biased representations that place particular emphasis on the semantics of a large number of similar entries. In addition, the original ESA approach to SR has a weakness: it does not consider the correlations or similarities between the Wikipedia articles in the text representations. To tackle these problems, we develop a novel method that clusters redundant or similar entries using a similarity measure based on Paragraph Vector (PV), a neural network language model. Experimental results on four datasets show that our framework can achieve better relatedness accuracy than ESA.
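
The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the clustering algorithm, the threshold, or how weights are combined, so the greedy single-pass clustering, the threshold of 0.9, the summing of ESA weights inside a cluster, and all names and toy numbers below are assumptions made only for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def cluster_concepts(concept_vecs, threshold):
    """Greedy single-pass clustering (an assumed stand-in for the paper's method).

    Each concept joins the first cluster whose seed vector has cosine
    similarity >= threshold; otherwise it starts a new cluster.
    """
    seeds, labels = [], []
    for vec in concept_vecs:
        for cluster_id, seed in enumerate(seeds):
            if cosine(vec, seed) >= threshold:
                labels.append(cluster_id)
                break
        else:
            seeds.append(vec)
            labels.append(len(seeds) - 1)
    return labels


def merge_by_cluster(esa_vec, labels):
    """Collapse an ESA vector to one dimension per cluster by summing weights (assumption)."""
    merged = [0.0] * (max(labels) + 1)
    for weight, cluster_id in zip(esa_vec, labels):
        merged[cluster_id] += weight
    return merged


# Hypothetical paragraph vectors for four Wikipedia articles;
# articles 0 and 1 are near-duplicates.
paragraph_vecs = [[1.0, 0.0], [0.98, 0.2], [0.0, 1.0], [-1.0, 0.1]]
labels = cluster_concepts(paragraph_vecs, threshold=0.9)

# Hypothetical ESA weights of two texts over the same four articles.
text_a = [0.5, 0.0, 0.2, 0.0]
text_b = [0.0, 0.5, 0.2, 0.0]

plain = cosine(text_a, text_b)
clustered = cosine(merge_by_cluster(text_a, labels), merge_by_cluster(text_b, labels))
print(labels, plain, clustered)
```

In this toy case the two texts activate different but near-duplicate articles, so plain ESA cosine scores them as weakly related, while merging the redundant dimensions raises the score.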

Bibliographic record

Source
Web technologies and applications | 2015 | pp. 647-657 | 11 pages
Conference location: Guangzhou (CN)
Authors
Hai-Tao Zheng; Wenzhen Wu
Author affiliations

Graduate School at Shenzhen, Tsinghua University, China;

Graduate School at Shenzhen, Tsinghua University, China;

Original format: PDF
Text language: English
Keywords
Semantic Relatedness; Explicit Semantic Analysis; Paragraph Vector; Clustering

Date indexed: 2022-08-26 14:26:58

Similar documents

1. (050-051) Proposals to add a new interpretative paragraph with new Examples to Article 36, dealing with certain designations published without explicit acceptance [J]. Sennikov Alexander N., Barkworth Mary E., Welker Cassiano A. D., Taxon. 2015, No. 3
2. Microblog summarization using Paragraph Vector and semantic structure [J]. Wang Ruiyi, Luo Senlin, Pan Limin, Computer speech and language. 2019, No. SEPa
3. Boosting Explicit Semantic Analysis by Clustering Paragraph Vectors of Wikipedia Articles [C]. Hai-Tao Zheng, Wenzhen Wu, Asia-Pacific Web Conference. 2015
4. Multilingual Knowledge Production and Dissemination in Wikipedia: A Spatial Narrative Analysis of the Collaborative Construction of City-Related Articles Within the User-Generated Encyclopaedia [D]. Jones, Henry A. 2017
5. Colloquium Paper Mapping Knowledge Domains: From paragraph to graph: Latent semantic analysis for information visualization [O]. Thomas K. Landauer, Darrell Laham, Marcia Derr. 2004
6. KMI, The Open University at NTCIR-9 CrossLink: Cross-Lingual Link Discovery in Wikipedia using explicit semantic analysis [O]. Knoth Petr, Zilka Lukas, Zdrahal Zdenek. 2011
