Conference: IJCNLP 2011

Automatic Labeling of Voiced Consonants for Morphological Analysis of Modern Japanese Literature

Abstract

Since the present-day Japanese use of the voiced consonant mark was only established during the Meiji Era, modern Japanese literary texts written in that era often lack voiced consonant marks that are now compulsory. This degrades the performance of morphological analyzers that rely on an ordinary dictionary. In this paper, we propose an approach for automatically labeling voiced consonant marks in modern literary Japanese. We formulate the task as a binary classification problem. Our pointwise prediction method uses as its feature set only surface information about the surrounding character strings. As a consequence, a training corpus is easy to obtain and maintain, because we can exploit a partially annotated corpus for learning. We compared our proposed method with a dictionary-based approach as a preprocessing step for morphological analysis, and confirmed that pointwise prediction outperforms the dictionary-based approach by a large margin.
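
The pointwise formulation can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the window size, the n-gram lengths, and the restriction to hiragana are all assumptions made for illustration. Each character that could carry a voiced consonant mark becomes one independent binary decision, and its features are character n-grams taken from the surrounding surface string only.

```python
# Hypothetical sketch: names, window size, and n-gram lengths are assumptions,
# not details taken from the paper. Only hiragana is covered here.

# Kana that can take a voiced consonant mark, mapped to their voiced forms.
VOICED = dict(zip("かきくけこさしすせそたちつてとはひふへほ",
                  "がぎぐげござじずぜぞだぢづでどばびぶべぼ"))

def candidate_positions(text):
    """Indices of characters that could receive a voiced consonant mark."""
    return [i for i, ch in enumerate(text) if ch in VOICED]

def surface_features(text, i, window=3, max_n=3):
    """Character n-grams around position i, each tagged with its relative offset."""
    padded = "^" * window + text + "$" * window  # pad so edge positions get full windows
    c = i + window                               # index of the target in the padded string
    feats = []
    for n in range(1, max_n + 1):
        for start in range(c - window, c + window - n + 2):
            feats.append(f"{start - c}:{padded[start:start + n]}")
    return feats

def apply_labels(text, labels):
    """Return text with every position labeled True replaced by its voiced form."""
    chars = list(text)
    for i, voiced in labels.items():
        if voiced:
            chars[i] = VOICED[chars[i]]
    return "".join(chars)
```

Because every position is classified on its own, a training instance needs only the label at that one position. This is what lets a partially annotated corpus serve as training data: unlabeled positions are simply skipped. Any binary classifier could consume these features; the sketch deliberately leaves that choice open.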
