Source: Conference on Intelligent Text Processing and Computational Linguistics; CICLing 2014

Knowledge Discovery with CRF-Based Clustering of Named Entities without a Priori Classes



Abstract

Knowledge discovery aims to bring out coherent groups of entities. It is usually based on clustering, which requires defining a notion of similarity between the relevant entities. In this paper, we propose to repurpose a supervised machine learning technique (namely Conditional Random Fields, widely used for supervised labeling tasks) to compute, indirectly and without supervision, similarities among text sequences. Our approach consists of generating artificial labeling problems on the data to reveal regularities between entities through their labeling. We describe how this framework can be implemented and evaluate it on two information extraction/discovery tasks. The results demonstrate the usefulness of this unsupervised approach and open many avenues for defining similarities for complex representations of textual data.
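The idea of deriving similarities from artificial labeling problems can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no algorithmic details, so the random train/apply split, the number of rounds and labels, and the co-labeling ratio used as similarity are all assumptions. A simple feature-count labeler stands in for the CRF so that the example runs with the standard library alone, and every name (`train_labeler`, `predict`, `colabel_similarity`, the toy entities) is hypothetical.

```python
import random
from collections import Counter, defaultdict
from itertools import combinations


def train_labeler(examples):
    """Stand-in for CRF training (assumption): count feature/label co-occurrences."""
    counts = defaultdict(Counter)
    for features, label in examples:
        for feature in features:
            counts[feature][label] += 1
    return counts


def predict(counts, features, labels):
    """Return the label with the highest summed count; ties go to the first label."""
    scores = Counter()
    for feature in features:
        scores.update(counts.get(feature, {}))
    return max(labels, key=lambda label: scores[label])


def colabel_similarity(entities, n_rounds=200, n_labels=3, seed=0):
    """Estimate pairwise similarity as the fraction of rounds in which two
    entities receive the same label from a model trained on random labels."""
    rng = random.Random(seed)
    names = sorted(entities)
    labels = list(range(n_labels))
    together = Counter()  # rounds in which both entities were labeled
    same = Counter()      # rounds in which both received the same label
    for _ in range(n_rounds):
        shuffled = names[:]
        rng.shuffle(shuffled)
        half = len(shuffled) // 2
        train, apply_set = shuffled[:half], shuffled[half:]
        # Artificial labeling problem: training entities get random labels.
        model = train_labeler([(entities[n], rng.choice(labels)) for n in train])
        predicted = {n: predict(model, entities[n], labels) for n in apply_set}
        for a, b in combinations(sorted(apply_set), 2):
            together[(a, b)] += 1
            if predicted[a] == predicted[b]:
                same[(a, b)] += 1
    return {pair: same[pair] / together[pair] for pair in together}


if __name__ == "__main__":
    # Toy entities described by hypothetical context features.
    toy = {
        "paris": {"in", "city", "capital"},
        "london": {"in", "city", "capital"},
        "ibm": {"shares", "ceo"},
        "acme": {"ceo", "hired"},
    }
    for pair, score in sorted(colabel_similarity(toy).items()):
        print(pair, round(score, 2))
```

The resulting pairwise scores could then be passed to any standard clustering algorithm. In a setting closer to the paper, the stand-in labeler would be replaced by a CRF trained on token sequences.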
