Venue: Conference on Empirical Methods in Natural Language Processing; Conference on Computational Natural Language Learning

Iterative Annotation Transformation with Predict-Self Reestimation for Chinese Word Segmentation



Abstract

In this paper we first describe automatic annotation transformation, a technique based on the annotation adaptation algorithm (Jiang et al., 2009) that automatically transforms a human-annotated corpus from one annotation guideline to another. We then propose two optimization strategies, iterative training and predict-self reestimation, to further improve the accuracy of annotation guideline transformation. Experiments on Chinese word segmentation show that the iterative training strategy together with predict-self reestimation brings significant improvement over the simple annotation transformation baseline, and leads to classifiers with significantly higher accuracy and processing several times faster than annotation adaptation. On the Penn Chinese Treebank 5.0, it achieves an F-measure of 98.43%, significantly outperforming previous work despite using a single classifier with only local features.
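For readers unfamiliar with the metric reported above, the following is a minimal sketch of how a word segmentation F-measure is commonly computed: each word is mapped to its character span, and precision and recall are taken over spans that match exactly. The function names `word_spans` and `segmentation_f_measure` and the toy sentence are illustrative assumptions; the abstract does not describe the authors' evaluation script.

```python
def word_spans(words):
    """Convert a list of words into a set of (start, end) character spans."""
    spans = set()
    start = 0
    for word in words:
        end = start + len(word)
        spans.add((start, end))
        start = end
    return spans


def segmentation_f_measure(gold_sentences, predicted_sentences):
    """Return (precision, recall, F1) over exactly matching word spans.

    Both arguments are lists of sentences; each sentence is a list of words.
    This is the usual span-based definition, assumed here for illustration.
    """
    correct = gold_total = predicted_total = 0
    for gold, predicted in zip(gold_sentences, predicted_sentences):
        if "".join(gold) != "".join(predicted):
            raise ValueError("gold and predicted must cover the same characters")
        gold_spans = word_spans(gold)
        predicted_spans = word_spans(predicted)
        correct += len(gold_spans & predicted_spans)
        gold_total += len(gold_spans)
        predicted_total += len(predicted_spans)
    precision = correct / predicted_total if predicted_total else 0.0
    recall = correct / gold_total if gold_total else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy example (hypothetical data): the prediction wrongly splits one gold word.
gold = [["中文", "分词", "很", "有趣"]]
predicted = [["中文", "分", "词", "很", "有趣"]]
precision, recall, f1 = segmentation_f_measure(gold, predicted)
print(f"P={precision:.4f} R={recall:.4f} F={f1:.4f}")
```

In the toy example, three of the five predicted words match the four gold words, so precision is 3/5 and recall is 3/4.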
