9th International Conference on Language Resources and Evaluation

A Graph-Based Approach for Computing Free Word Associations

Abstract

A graph-based algorithm is used to analyze the co-occurrences of words in the British National Corpus. It is shown that the statistical regularities detected can be exploited to predict human word associations. The corpus-derived associations are evaluated using a large test set comprising several thousand stimulus/response pairs collected from human subjects. We find a high level of agreement between the two types of data. The considerable size of the test set allows us to split the stimulus words into a number of classes relating to particular word properties. For example, we construct six saliency classes, and for the words in each of these classes we compare the simulation results with the human data. It turns out that for each class there is a close relationship between the performance of our system and human performance. This is also the case for classes based on two other properties of words, namely syntactic and semantic word ambiguity. We interpret these findings as evidence for the claim that human association acquisition must be based on the statistical analysis of perceived language, and that when producing associations the detected statistical regularities are replicated.
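The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes associations are ranked by raw co-occurrence counts within a fixed token window, and it scores agreement as the fraction of stimuli whose top-ranked neighbor matches the human response. The toy corpus, the window size, and all function names are hypothetical illustrations.

```python
from collections import Counter, defaultdict


def build_cooccurrence_graph(sentences, window=2):
    """Build an undirected weighted graph: word -> Counter(neighbor -> count).

    Two words are linked when they occur within `window` tokens of each other
    in the same sentence. (Assumption: the paper's weighting may differ.)
    """
    graph = defaultdict(Counter)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            for j in range(i + 1, min(i + 1 + window, len(tokens))):
                other = tokens[j]
                if other != word:
                    graph[word][other] += 1
                    graph[other][word] += 1
    return graph


def predict_association(graph, stimulus):
    """Return the neighbor with the highest edge weight, or None if unseen.

    Ties are broken alphabetically so the result is deterministic.
    """
    neighbors = graph.get(stimulus)
    if not neighbors:
        return None
    return min(neighbors.items(), key=lambda kv: (-kv[1], kv[0]))[0]


def agreement(graph, human_pairs):
    """Fraction of (stimulus, response) pairs where the prediction matches."""
    hits = sum(1 for s, r in human_pairs if predict_association(graph, s) == r)
    return hits / len(human_pairs)


# Hypothetical toy data standing in for a corpus and a human association test set.
toy_corpus = [
    "bread and butter".split(),
    "butter on bread".split(),
    "black and white".split(),
    "white or black".split(),
]
human_pairs = [("bread", "butter"), ("black", "white")]

graph = build_cooccurrence_graph(toy_corpus, window=2)
score = agreement(graph, human_pairs)
```

A real system would likely normalize counts for word frequency (for example with an association measure) rather than use raw counts, since frequent function words would otherwise dominate the ranking.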