Journal: Neurocomputing

Federating clustering and cluster labelling capabilities with a single approach based on feature maximization: French verb classes identification with IGNGF neural clustering

Abstract

Classifications which group together verbs and a set of shared syntactic and semantic properties have proven to be useful in both linguistics and Natural Language Processing tasks. However, most existing approaches for automatically acquiring verb classes fail to associate the verb classes produced with an explicit characterisation of the syntactic and semantic properties shared by the class elements. We propose a novel approach to verb clustering which addresses this shortcoming and permits building verb classifications whose classes group together verbs, subcategorisation frames and thematic grids. Our approach involves the use of a recent neural clustering method called IGNGF (Incremental Growing Neural Gas with Feature maximization). The use of a standard distance measure for determining a winner is replaced in IGNGF by a feature maximisation measure relying on the features of the data that are associated with clusters during learning. A main advantage of the method is that maximised features used by IGNGF during learning can also be exploited in a final step for accurately labelling the resulting clusters. In this paper, we exploit IGNGF for the unsupervised classification of French verbs and evaluate the obtained clusters (i.e., verb classes) in two different ways. The first way is a quantitative analysis of the clustering process relying on a usual gold standard and on complementary unbiased clustering quality indexes. The second way is a qualitative analysis of the cluster labelling process. Relying on an adapted gold standard, we evaluate the capacity of the IGNGF cluster labels (i.e., subcategorisation frames and thematic grids) to be exploited for bootstrapping a VerbNet-like classification for French. Both analyses clearly highlight the advantages of the approach.
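
The abstract does not spell out the feature maximisation measure, so the following is only an illustrative sketch of how features could be used to label clusters once a clustering exists. It assumes a feature F-measure defined as the harmonic mean of a feature recall (the share of a feature's total weight that falls in a cluster) and a feature precision (the share of a cluster's total weight carried by that feature). That definition, the function names `feature_f_measure` and `label_clusters`, and the rule "a feature labels the cluster where its F-measure is highest" are assumptions made for illustration, not the paper's stated procedure. The sketch does not implement IGNGF itself or its winner selection.

```python
import numpy as np


def feature_f_measure(X, labels, n_clusters):
    """Return an (n_clusters, n_features) array of feature F-measures.

    X is a non-negative (n_items, n_features) weight matrix, e.g. verbs by
    subcategorisation frames; labels[i] is the cluster index of item i.
    The recall/precision definitions below are an assumption (see lead-in).
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)

    # mass[c, f] = total weight of feature f over the items of cluster c
    mass = np.zeros((n_clusters, X.shape[1]))
    for c in range(n_clusters):
        mass[c] = X[labels == c].sum(axis=0)

    feature_total = mass.sum(axis=0, keepdims=True)  # weight of f over all clusters
    cluster_total = mass.sum(axis=1, keepdims=True)  # weight of all features in c

    # Guard against division by zero for unused features or empty clusters.
    recall = np.divide(mass, feature_total,
                       out=np.zeros_like(mass), where=feature_total > 0)
    precision = np.divide(mass, cluster_total,
                          out=np.zeros_like(mass), where=cluster_total > 0)

    denom = recall + precision
    return np.divide(2 * recall * precision, denom,
                     out=np.zeros_like(mass), where=denom > 0)


def label_clusters(X, labels, n_clusters, feature_names):
    """Attach each feature to the cluster where its F-measure is highest.

    This selection rule is a simplification chosen for the sketch.
    """
    ff = feature_f_measure(X, labels, n_clusters)
    best = ff.argmax(axis=0)  # best cluster index for every feature
    return {
        c: [feature_names[f] for f in range(ff.shape[1])
            if best[f] == c and ff[c, f] > 0]
        for c in range(n_clusters)
    }
```

Called on a small verbs-by-frames matrix with hypothetical frame names, `label_clusters` returns a dictionary mapping each cluster index to the frames that characterise it most strongly, which is the kind of explicit class description the abstract says most verb clustering approaches lack.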
