Conference: 9th International Conference on Language Resources and Evaluation

A Graph-Based Approach for Computing Free Word Associations


Abstract

A graph-based algorithm is used to analyze word co-occurrences in the British National Corpus. We show that the statistical regularities it detects can be exploited to predict human word associations. The corpus-derived associations are evaluated on a large test set of several thousand stimulus/response pairs collected from human subjects, and the two types of data agree closely. The considerable size of the test set allows us to split the stimulus words into classes defined by particular word properties. For example, we construct six saliency classes and, for the words in each class, compare the simulation results with the human data. For every class, the performance of our system closely tracks human performance. The same holds for classes based on two other word properties, namely syntactic and semantic ambiguity. We interpret these findings as evidence that human association acquisition must be based on the statistical analysis of perceived language, and that the detected statistical regularities are replicated when associations are produced.
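
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a plain windowed co-occurrence count, predicts the heaviest-weighted neighbour as the associative response, and scores agreement as the share of stimuli whose prediction matches the human response. The function names, the window size, the toy corpus, and the toy stimulus/response pairs are all hypothetical.

```python
from collections import Counter, defaultdict


def build_cooccurrence_graph(sentences, window=2):
    """Return an undirected weighted graph as {word: Counter({neighbour: count})}.

    Assumption: edge weight = raw count of co-occurrences within `window`
    following tokens. The paper's actual weighting is not given in the abstract.
    """
    graph = defaultdict(Counter)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            # Link each word to the words that follow it within the window.
            for other in tokens[i + 1 : i + 1 + window]:
                if other != word:
                    graph[word][other] += 1
                    graph[other][word] += 1
    return graph


def predict_association(graph, stimulus):
    """Return the neighbour with the heaviest edge, or None for unseen words."""
    neighbours = graph.get(stimulus)
    if not neighbours:
        return None
    # Break ties alphabetically so the result is deterministic.
    return min(neighbours, key=lambda w: (-neighbours[w], w))


def agreement(graph, human_pairs):
    """Fraction of stimuli whose predicted response equals the human response."""
    hits = sum(predict_association(graph, s) == r for s, r in human_pairs)
    return hits / len(human_pairs)


# Hypothetical toy data, standing in for a corpus and a human association norm.
toy_corpus = [
    "bread and butter".split(),
    "butter on bread".split(),
    "salt and pepper".split(),
    "pepper salt shaker".split(),
]
toy_human_pairs = [("bread", "butter"), ("salt", "pepper"), ("butter", "knife")]

graph = build_cooccurrence_graph(toy_corpus, window=2)
for stimulus, human_response in toy_human_pairs:
    print(stimulus, "->", predict_association(graph, stimulus), "| human:", human_response)
print("agreement:", agreement(graph, toy_human_pairs))
```

Raw counts tend to favour frequent function words, so a realistic system would likely replace them with a statistical association measure. Computing `agreement` separately for subsets of stimuli would mirror the per-class comparison described above.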
