Leveraging Rich Linguistic Features for Cross-domain Chinese Segmentation

Abstract

This paper describes the system we used for the Chinese segmentation task in the 3rd CIPS-SIGHAN bakeoff. We treat segmentation as character sequence labeling and, to improve accuracy across multiple domains, present a CRF-based Chinese segmentation system that integrates supervised, unsupervised, and lexical features. We first produce a preliminary segmentation of the target data using a CRF model trained on these three feature types; new words detected in that output are added to the lexicon. To generalize across domains, we then run a second segmentation pass with the updated lexicon. Refined post-processing further improves out-of-vocabulary (OOV) recognition. All the features we use share a unified feature template trained with the CRF. Our system achieves a competitive F-score of 0.9730 in this bakeoff.
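The character-labeling view and the two-pass lexicon update described in the abstract can be illustrated with a minimal sketch. Several parts are assumptions rather than details from the paper: the B/M/E/S tag set (the abstract does not name one), the helper names `words_to_tags`, `tags_to_words`, and `two_pass_segment`, the `tagger` callable standing in for the trained CRF, and the placeholder rule that treats any multi-character word missing from the lexicon as a new word.

```python
from typing import Callable, List, Set, Tuple

# Hypothetical tagger interface: (sentence, lexicon) -> one tag per character.
Tagger = Callable[[str, Set[str]], List[str]]


def words_to_tags(words: List[str]) -> List[str]:
    """Map a segmented sentence to per-character B/M/E/S tags (assumed tag set)."""
    tags: List[str] = []
    for word in words:
        if len(word) == 1:
            tags.append("S")  # single-character word
        else:
            # Begin, zero or more Middle, End
            tags.extend(["B"] + ["M"] * (len(word) - 2) + ["E"])
    return tags


def tags_to_words(chars: str, tags: List[str]) -> List[str]:
    """Rebuild words from characters and their B/M/E/S tags."""
    words: List[str] = []
    current = ""
    for ch, tag in zip(chars, tags):
        current += ch
        if tag in ("E", "S"):  # a word ends here
            words.append(current)
            current = ""
    if current:  # tolerate a tag sequence that ends mid-word
        words.append(current)
    return words


def two_pass_segment(
    sentences: List[str], lexicon: Set[str], tagger: Tagger
) -> Tuple[List[List[str]], Set[str]]:
    """Segment, grow the lexicon from the first pass, then segment again."""
    first_pass = [tags_to_words(s, tagger(s, lexicon)) for s in sentences]
    # Placeholder new-word rule (an assumption, not the paper's criterion).
    new_words = {
        w for sent in first_pass for w in sent if len(w) > 1 and w not in lexicon
    }
    updated = set(lexicon) | new_words
    second_pass = [tags_to_words(s, tagger(s, updated)) for s in sentences]
    return second_pass, updated
```

In this sketch the lexicon only matters through whatever lexical features the tagger derives from it, so the second pass differs from the first only if the tagger actually consults the updated lexicon.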

Bibliographic record

Source: CIPS-SIGHAN Joint Conference on Chinese Language Processing, 2014, pp. 101-107 (7 pages)
Conference location: Wuhan (CN)
Authors: Guohua Wu; Dezhu He; Keli Zhong; Xue Zhou; Caixia Yuan
Author affiliation: School of Computer, Beijing University of Posts and Telecommunications, China 100876
Original format: PDF
