Learning Concepts from Text Based on the Inner-Constructive Model

Abstract

This paper presents a new model for the automatic acquisition of lexical concepts from text, referred to as the Concept Inner-Constructive Model (CICM). The CICM specifies the rules by which words combine to form concepts along four dimensions: (1) parts of speech, (2) syllables, (3) senses, and (4) attributes. First, we extract a large number of candidate concepts using lexico-patterns and confirm a subset of them as concepts if they match enough patterns a sufficient number of times. We then learn CICMs automatically from the confirmed concepts and use the model to identify further concepts. Essentially, the CICM is an instance-learning model, but it differs from most existing models in that it takes into account a variety of linguistic as well as statistical features of words. To make analogy more effective when learning new concepts with CICMs, we cluster similar words based on density. The method was evaluated on a 160G raw corpus, from which 5,344,982 concepts were extracted with a precision of 89.11% and a recall of 84.23%.
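The first stage described in the abstract (pattern-based candidate extraction followed by confirmation) might look roughly like the sketch below. It is only an illustration under stated assumptions. The three English patterns, the function names `extract_candidates` and `confirm_concepts`, and the `min_patterns` threshold are all hypothetical and are not taken from the paper. The sketch also interprets "matched enough patterns" as "matched at least `min_patterns` distinct patterns", which may differ from the authors' actual criterion.

```python
import re
from collections import defaultdict

# Hypothetical English lexico-patterns (assumption: the paper's own patterns
# are not given in the abstract). Each pattern captures one candidate term.
PATTERNS = {
    "is_a_kind_of": re.compile(r"\b(?:an?\s+)?(\w+) is a kind of \w+"),
    "such_as": re.compile(r"\b\w+ such as (?:an?\s+)?(\w+)"),
    "and_other": re.compile(r"\b(\w+) and other \w+"),
}


def extract_candidates(sentences):
    """Map each candidate term to the set of pattern names it matched."""
    matched = defaultdict(set)
    for sentence in sentences:
        text = sentence.lower()
        for name, pattern in PATTERNS.items():
            for m in pattern.finditer(text):
                matched[m.group(1)].add(name)
    return matched


def confirm_concepts(matched, min_patterns=2):
    """Keep candidates that matched at least `min_patterns` distinct patterns.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    return {term for term, names in matched.items() if len(names) >= min_patterns}


corpus = [
    "A violin is a kind of instrument.",
    "She plays instruments such as violin.",
    "Violin and other instruments were on sale.",
    "He bought fruit such as mango.",
]
matched = extract_candidates(corpus)
confirmed = confirm_concepts(matched)
```

On this toy corpus, `violin` matches all three patterns and is confirmed, while `mango` matches only one and remains an unconfirmed candidate. In the pipeline the abstract describes, the confirmed set would then serve as training instances for learning CICMs.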
