
Diachronic Deviation Features in Continuous Space Word Representations


Abstract

In distributed word representation, each word is represented as a single point in a vector space. This paper extends that idea to a diachronic setting, in which separate word embeddings are trained on corpora from different time periods. These embeddings are then mapped into a single target space via a linear transformation, so that each word is represented in the target space as a distribution rather than a point. The deviation features of this distribution reflect how the word's meaning varies across time periods. Experiments show that groups of words with similar deviation features indicate the hot topics of different eras, and that changes in the frequency of these word groups can be used to detect the period in which each topic reached its peak popularity.
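
The mapping and deviation steps described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the abstract does not say how the linear transformation is estimated or which deviation features are used, so the sketch assumes an ordinary least-squares fit and takes the mean distance from the per-word centroid as a stand-in deviation feature. The function names `fit_linear_map` and `deviation_features` are hypothetical, and the rows of every embedding matrix are assumed to be aligned to the same vocabulary.

```python
import numpy as np


def fit_linear_map(source, target):
    """Return W minimizing ||source @ W - target||_F (assumed least-squares fit)."""
    W, *_ = np.linalg.lstsq(source, target, rcond=None)
    return W


def deviation_features(period_embeddings, target_index=0):
    """Map each period's embeddings into one target space and measure spread.

    period_embeddings: list of (n_words, dim) arrays, one per time period,
    with row i referring to the same word in every array (an assumption).
    Returns the per-word centroid and the per-word mean distance from it.
    """
    target = period_embeddings[target_index]
    # Each period gets its own linear map into the chosen target space.
    mapped = np.stack(
        [emb @ fit_linear_map(emb, target) for emb in period_embeddings]
    )  # shape: (n_periods, n_words, dim)
    centroid = mapped.mean(axis=0)  # (n_words, dim)
    distances = np.linalg.norm(mapped - centroid, axis=2)  # (n_periods, n_words)
    return centroid, distances.mean(axis=0)
```

Under these assumptions, a word whose mapped vectors coincide across periods gets a deviation near zero, while a word whose usage shifts gets a larger value. Grouping words by such values would be a separate step that this sketch does not cover.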
