Conference: IASTED International Conference on Intelligent Systems and Control

RECOGNIZE PERSON NAMES FROM CHINESE TEXTS BASED ON CLUSTERING SVM



Abstract

This paper presents a method for recognizing person names in Chinese texts based on a clustering Support Vector Machine (SVM). The following are extracted as features of the SVM input vectors: the character itself, its character-based part-of-speech (POS) tag, whether the character is a surname, the frequency of the character in a table of person names, and context information. A training set is built from these features. In practical training sets, however, the two classes of samples are imbalanced, so the training set is clustered with the kernel κ-means clustering algorithm. The experimental results show that the clustering-SVM model for recognizing Chinese person names is more efficient than the original model without clustering. The model can also be used to recognize other named entities, such as location names and organization names, and can be generalized to machine learning problems with unbalanced class distributions.
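The clustering step can be illustrated with a minimal sketch. The abstract does not say how the clusters are used to rebalance the training set, so this example assumes one common approach: the majority class is grouped with kernel k-means (written κ-means in the abstract) and only one representative sample per cluster is kept before SVM training. The RBF kernel, the `gamma` value, the round-robin initialization, and the function names `kernel_kmeans` and `undersample_majority` are all assumptions made for illustration, not details taken from the paper.

```python
import math


def rbf(x, y, gamma=1.0):
    """Gaussian (RBF) kernel between two equal-length numeric vectors."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def kernel_kmeans(points, k, gamma=1.0, max_iter=50):
    """Kernel k-means in RBF feature space.

    Returns (labels, K), where labels[i] is the cluster index of points[i]
    and K is the full kernel matrix. Initialization is a deterministic
    round-robin assignment (an assumption, chosen for reproducibility).
    """
    n = len(points)
    K = [[rbf(p, q, gamma) for q in points] for p in points]
    labels = [i % k for i in range(n)]

    def dist(i, members):
        # Squared feature-space distance from point i to the cluster mean,
        # expressed purely through kernel values.
        m = len(members)
        cross = sum(K[i][j] for j in members) / m
        within = sum(K[j][l] for j in members for l in members) / (m * m)
        return K[i][i] - 2.0 * cross + within

    for _ in range(max_iter):
        clusters = [[i for i in range(n) if labels[i] == c] for c in range(k)]
        non_empty = [c for c in range(k) if clusters[c]]
        new_labels = [
            min(non_empty, key=lambda c: dist(i, clusters[c])) for i in range(n)
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
    return labels, K


def undersample_majority(majority, k, gamma=1.0):
    """Keep one representative per cluster of the majority class.

    ASSUMPTION: the representative is the member with the highest total
    kernel similarity to its own cluster (closest to the cluster mean
    for an RBF kernel). The paper may use the clusters differently.
    """
    labels, K = kernel_kmeans(majority, k, gamma)
    representatives = []
    for c in range(k):
        members = [i for i, lab in enumerate(labels) if lab == c]
        if not members:
            continue
        best = max(members, key=lambda i: sum(K[i][j] for j in members))
        representatives.append(majority[best])
    return representatives
```

Under these assumptions, the reduced majority-class samples returned by `undersample_majority` would be combined with the untouched minority-class samples (here, characters that belong to person names) and passed to any SVM trainer. Setting `k` close to the size of the minority class gives a roughly balanced training set.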
