RECOGNIZE PERSON NAMES FROM CHINESE TEXTS BASED ON CLUSTERING SVM

Abstract

This paper presents a method of recognizing person names in Chinese texts based on a clustering Support Vector Machine (SVM). The character itself, its character-based part-of-speech (POS) tag, whether the character is a surname, the frequency of the character in a table of person names, and context information are extracted as the features of the vectors for the SVM algorithm. A training set is established. However, practical training sets are imbalanced between the two classes of samples, so the training set is clustered using the kernel κ-means clustering algorithm. The experimental results show that the model for recognizing Chinese person names based on clustering SVM is more efficient than the original model without clustering. The model can also be used to recognize other named entities, such as location names and organization names, and can be generalized to other machine learning problems with unbalanced class distributions.
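The abstract does not say how the clusters are used to correct the class imbalance. The sketch below illustrates one common reading, under stated assumptions: the majority class (characters that are not part of a person name) is clustered with kernel k-means in an RBF feature space, and only the member closest to each cluster mean is kept. The function names (`rbf_kernel`, `kernel_kmeans`, `reduce_majority`), the RBF kernel, the `gamma` value, and the one-representative-per-cluster rule are all hypothetical choices made for illustration, not details taken from the paper. Feature vectors are assumed to be already encoded as numbers.

```python
import math
import random

def rbf_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel between two equal-length numeric vectors."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def kernel_kmeans(points, k, gamma=0.5, max_iter=50, seed=0):
    """Cluster points in the RBF feature space; returns one label per point."""
    n = len(points)
    K = [[rbf_kernel(p, q, gamma) for q in points] for p in points]
    labels = [i % k for i in range(n)]  # every cluster starts non-empty
    random.Random(seed).shuffle(labels)
    for _ in range(max_iter):
        members = [[i for i in range(n) if labels[i] == c] for c in range(k)]
        # The within-cluster term depends only on the cluster, so compute it once.
        within = [
            sum(K[i][j] for i in m for j in m) / len(m) ** 2 if m else 0.0
            for m in members
        ]
        new_labels = []
        for i in range(n):
            best, best_dist = labels[i], float("inf")
            for c, m in enumerate(members):
                if not m:
                    continue  # an emptied cluster cannot attract points
                # Squared feature-space distance from point i to the mean of cluster c.
                dist = K[i][i] - 2 * sum(K[i][j] for j in m) / len(m) + within[c]
                if dist < best_dist:
                    best, best_dist = c, dist
            new_labels.append(best)
        if new_labels == labels:
            break
        labels = new_labels
    return labels

def reduce_majority(majority, k, gamma=0.5):
    """Keep one representative per cluster of the majority class (an assumption)."""
    labels = kernel_kmeans(majority, k, gamma)
    representatives = []
    for c in range(k):
        m = [i for i, label in enumerate(labels) if label == c]
        if not m:
            continue
        # With an RBF kernel K(x, x) = 1, so the member nearest the cluster mean
        # is the one with the largest total similarity to the other members.
        best = max(m, key=lambda i: sum(rbf_kernel(majority[i], majority[j], gamma) for j in m))
        representatives.append(majority[best])
    return representatives

if __name__ == "__main__":
    # Toy data: 8 majority-class vectors and 2 minority-class vectors.
    majority = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (0.3, 0.2),
                (5.0, 5.1), (5.2, 4.9), (4.8, 5.0), (5.1, 5.3)]
    minority = [(2.5, 2.4), (2.6, 2.7)]
    reduced = reduce_majority(majority, k=len(minority))
    print(len(majority), "majority samples reduced to", len(reduced))
```

The reduced majority set and the untouched minority set would then form the training set passed to an SVM implementation of the reader's choice. Choosing `k` equal to the minority class size is only one way to set the target balance.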

Bibliographic details

Source
IASTED International Conference on Intelligent Systems and Control | 2007 | 5 pages
Authors
Lishuang Li; Zhuoye Ding; Degen Huang
Original format: PDF
Chinese Library Classification: TP18-53
Keywords
Recognition of Chinese person name; Entity name; Clustering SVM; Machine learning
