Principal Components for Automatic Term Hierarchy Building


Abstract

We show that the singular value decomposition of a term similarity matrix induces a term hierarchy. This decomposition, used in Latent Semantic Analysis and Principal Component Analysis for text, aims at identifying "concepts" that can be used in place of the terms appearing in the documents. Unlike terms, concepts are by construction uncorrelated and hence are less sensitive to the particular vocabulary used in documents. In this work, we explore the relation between terms and concepts and show that for each term there exists a latent subspace dimension for which the term coincides with a concept. By varying the number of dimensions, terms that are similar to the concept but more specific can be identified, leading to a term hierarchy.
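
The decomposition described above can be illustrated with a small sketch. The abstract does not say how the similarity matrix is built or how the hierarchy is read off, so the following makes assumptions: cosine similarity between term columns of a toy document-term matrix, and a plain rank-k truncation of the SVD. The names `term_similarity`, `rank_k_similarity`, and the toy vocabulary are hypothetical, and this is not the authors' algorithm; it only shows how a term's similarity profile changes as the number of retained dimensions varies.

```python
import numpy as np


def term_similarity(doc_term):
    """Cosine similarity between the term columns of a document-term matrix.

    Assumption: the abstract does not name the similarity measure; cosine
    similarity is used here purely for illustration.
    """
    X = np.asarray(doc_term, dtype=float)
    norms = np.linalg.norm(X, axis=0)
    norms[norms == 0.0] = 1.0  # avoid division by zero for unused terms
    Xn = X / norms
    return Xn.T @ Xn


def rank_k_similarity(S, k):
    """Rank-k approximation of S built from its k largest singular triplets."""
    U, s, Vt = np.linalg.svd(S)
    return (U[:, :k] * s[:k]) @ Vt[:k, :]


# Toy data (hypothetical): rows are documents, columns are terms.
terms = ["animal", "dog", "cat", "poodle", "siamese"]
doc_term = np.array([
    [1, 1, 0, 1, 0],
    [1, 1, 0, 0, 0],
    [1, 0, 1, 0, 1],
    [1, 0, 1, 0, 0],
    [1, 1, 1, 0, 0],
])

S = term_similarity(doc_term)

# Inspect how the similarity profile of one term changes with k.
for k in range(1, len(terms) + 1):
    S_k = rank_k_similarity(S, k)
    row = S_k[terms.index("dog")]
    print(k, {t: round(float(v), 2) for t, v in zip(terms, row)})
```

With few dimensions, broad terms tend to dominate the reconstructed similarities; as k grows, the reconstruction approaches the original matrix and narrower terms separate. How the paper turns this behaviour into parent-child links is not stated in the abstract, so that step is left out here.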
