Integration of Named Entity Information for Chinese Word Segmentation Based on Maximum Entropy

Abstract

Word segmentation is an essential process in Chinese information processing. Although related research has been reported and has made progress, the Unknown Named Entity (UNE) problem in segmentation is not fully solved, and this usually degrades overall segmentation accuracy. This paper presents a model that identifies UNEs in order to improve the overall performance of segmentation. To capture named entity (NE) information, the functions of characters or words are defined with tags. In addition, useful surrounding contexts are collected from a corpus and used as features. The model is built on Maximum Entropy and treats UNE identification as a tagging problem. Empirical experiments show that overall segmentation accuracy improves after the UNE identification module is integrated into the word segmenter.
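
As a rough illustration of the tagging formulation described in the abstract, the following minimal sketch trains a tiny maximum-entropy (softmax) classifier that assigns each character a tag based on its surrounding context. The tag set (B/I/E/O), the feature templates, the toy sentences, and all function names are assumptions made for illustration only; the abstract does not specify the paper's actual tags, features, or training procedure.

```python
import math
from collections import defaultdict

# Hypothetical tag set (not from the paper): position of a character
# inside a named entity (Begin / Inside / End), or O for outside.
TAGS = ["B", "I", "E", "O"]

def context_features(chars, i):
    """Binary context features for position i (assumed templates)."""
    prev_c = chars[i - 1] if i > 0 else "<s>"
    next_c = chars[i + 1] if i + 1 < len(chars) else "</s>"
    return [f"c0={chars[i]}", f"c-1={prev_c}", f"c+1={next_c}",
            f"c-1c0={prev_c}{chars[i]}", f"c0c+1={chars[i]}{next_c}"]

def tag_probs(weights, feats):
    """Maximum-entropy (softmax) distribution p(tag | context)."""
    scores = [sum(weights[(f, t)] for f in feats) for t in TAGS]
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def train(samples, epochs=200, lr=0.5):
    """Fit weights by stochastic gradient ascent on the log-likelihood."""
    weights = defaultdict(float)
    for _ in range(epochs):
        for feats, gold in samples:
            probs = tag_probs(weights, feats)
            for t, p in zip(TAGS, probs):
                grad = (1.0 if t == gold else 0.0) - p  # observed - expected
                for f in feats:
                    weights[(f, t)] += lr * grad
    return weights

def tag(weights, chars):
    """Tag each character independently with its most probable tag."""
    out = []
    for i in range(len(chars)):
        probs = tag_probs(weights, context_features(chars, i))
        out.append(TAGS[probs.index(max(probs))])
    return out

def entities(chars, tags):
    """Collect character spans labelled B (I)* E as entity strings."""
    found, start = [], None
    for i, t in enumerate(tags):
        if t == "B":
            start = i
        elif t == "E" and start is not None:
            found.append("".join(chars[start:i + 1]))
            start = None
        elif t == "O":
            start = None
    return found

# Toy training data, invented for this sketch.
train_sents = [
    (list("王小明去北京"), ["B", "I", "E", "O", "B", "E"]),
    (list("李华在上海"), ["B", "E", "O", "B", "E"]),
]
samples = [(context_features(c, i), t[i])
           for c, t in train_sents for i in range(len(c))]
weights = train(samples)

chars = list("王小明去北京")
print(entities(chars, tag(weights, chars)))
```

In a pipeline like the one the abstract outlines, the entity spans recovered this way could be passed to a word segmenter as candidate words. Tagging each character independently is a simplification; a practical tagger would normally also constrain the tag sequence.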

Bibliographic details

Source
International Conference on Intelligent Computing | 2008 | 8 pages
Authors
Ka Seng Leong; Fai Wong; Yiping Li; Ming Chui Dong
Original format: PDF
Chinese Library Classification: TP3-53

Similar literature

1. Chinese word segmentation and named entity recognition: A pragmatic approach [J]. Gao JF, Li M, Wu A. Computational Linguistics. 2005, Issue 4
2. Feature selection techniques for maximum entropy based biomedical named entity recognition [J]. Saha SK, Sarkar S, Mitra P. Journal of Biomedical Informatics. 2009, Issue 5
3. Integration of Named Entity Information for Chinese Word Segmentation Based on Maximum Entropy [C]. Ka Seng Leong, Fai Wong, Yiping Li. International Conference on Intelligent Computing (ICIC 2008). 2008
4. A maximum entropy approach to named entity recognition [D]. Borthwick, Andrew Eliot. 1999
5. Joint segmentation and named entity recognition using dual decomposition in Chinese discharge summaries [O]. Yan Xu, Yining Wang, Tianren Liu. 2014
6. A comparative study of word representation methods with conditional random fields and maximum entropy Markov for bio-named entity recognition [O]. Maan Tareq Abd, Masnizah Mohd. 2018
