Conference: Chinese Lexical Semantics Workshop

Named Entity Recognition for Chinese Novels in the Ming-Qing Dynasties


Abstract

This paper presents a Named Entity Recognition (NER) system for classical Chinese novels of the Ming and Qing dynasties, built on Conditional Random Fields (CRFs). An annotated corpus of four influential vernacular novels from this period serves as both training and testing data: in each experiment, three novels are used for training and the remaining novel is used for testing. Three sets of features are proposed for the CRF model: (1) a baseline feature set consisting of word/POS and bigram features for different window sizes, (2) the dependency head and dependency relation, and (3) Wikipedia categories. The F-measures for the four books range from 67% to 80%. Experiments show that adding the dependency head and relation, as well as Wikipedia categories, improves the performance of the NER system. Of the two, the Wikipedia categories (the third feature set) yield the greater improvement.
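The three feature sets and the train-on-three, test-on-one setup described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The token fields (`word`, `pos`, `head`, `rel`, `wiki_cats`), the feature names, the padding symbol, and the choice of word bigrams over adjacent window positions are all assumptions made for illustration, since the abstract does not specify a data format or feature templates. The output is a plain feature dictionary per token, the kind of input many CRF toolkits accept.

```python
from typing import Any, Dict, Iterator, List, Tuple

# Hypothetical token format: the abstract does not define one.
# Assumed keys: "word", "pos", "head" (index of the head token, or None
# for the root), "rel" (dependency relation), "wiki_cats" (category list).
Token = Dict[str, Any]

PAD = "<PAD>"  # assumed padding symbol for positions outside the sentence


def token_features(sent: List[Token], i: int, window: int = 2) -> Dict[str, str]:
    """Build a feature dict for token i, covering the three feature sets."""
    feats: Dict[str, str] = {}

    # (1) Baseline: word and POS at each offset in a symmetric window.
    for off in range(-window, window + 1):
        j = i + off
        inside = 0 <= j < len(sent)
        feats[f"w[{off}]"] = str(sent[j]["word"]) if inside else PAD
        feats[f"pos[{off}]"] = str(sent[j]["pos"]) if inside else PAD

    # Baseline bigrams: adjacent word pairs inside the window (an assumption;
    # the abstract only says "bigram").
    for off in range(-window, window):
        feats[f"w[{off}]|w[{off + 1}]"] = (
            feats[f"w[{off}]"] + "|" + feats[f"w[{off + 1}]"]
        )

    # (2) Dependency head word and dependency relation.
    head = sent[i].get("head")
    feats["dep_rel"] = str(sent[i].get("rel", "NONE"))
    feats["head_word"] = str(sent[head]["word"]) if head is not None else "<ROOT>"

    # (3) Wikipedia categories, one binary feature per category.
    for cat in sent[i].get("wiki_cats", []):
        feats[f"wiki={cat}"] = "1"

    return feats


def leave_one_novel_out(novels: List[str]) -> Iterator[Tuple[List[str], str]]:
    """Yield (training novels, held-out test novel) for each novel in turn."""
    for held_out in novels:
        yield [n for n in novels if n != held_out], held_out
```

With four novels, `leave_one_novel_out` produces four splits, which would match the abstract's report of one F-measure per book if each novel is held out once; that reading is an inference from the abstract rather than something it states outright.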
