A tagged corpus plays an important role in natural language processing based on a stochastic language model, and increasing the corpus size improves accuracy. A meaningful improvement, however, requires an exponential increase in corpus size, and the annotation cost this entails is not negligible. In this paper, we discuss the use of an untagged corpus. In our experiments, using an untagged corpus improved the predictive power of a stochastic language model and the accuracy of a kana-kanji converter based on it; for a tagger, however, the improvement was only slight.
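One common way to exploit an untagged corpus alongside a small tagged one is to linearly interpolate a model estimated from the tagged corpus with a word-based model estimated from the raw text. The following is a minimal sketch of that idea; the bigram form and the interpolation weight `lam` are illustrative assumptions, not necessarily the paper's actual formulation.

```python
from collections import defaultdict

def bigram_probs(corpus):
    """Maximum-likelihood bigram probabilities P(w2 | w1) from a
    corpus given as a list of tokenized sentences."""
    pair_counts = defaultdict(int)
    context_counts = defaultdict(int)
    for sentence in corpus:
        words = ["<s>"] + sentence
        for w1, w2 in zip(words, words[1:]):
            pair_counts[(w1, w2)] += 1
            context_counts[w1] += 1
    return {p: c / context_counts[p[0]] for p, c in pair_counts.items()}

def interpolated_prob(w1, w2, tagged_model, untagged_model, lam=0.7):
    """P(w2 | w1) = lam * P_tagged + (1 - lam) * P_untagged.
    `lam` is an assumed weight; in practice it would be tuned on
    held-out data (e.g. by deleted interpolation)."""
    return (lam * tagged_model.get((w1, w2), 0.0)
            + (1 - lam) * untagged_model.get((w1, w2), 0.0))

# Toy corpora standing in for a small tagged and a large untagged corpus.
tagged = [["a", "b"], ["a", "c"]]
untagged = [["a", "b"], ["a", "b"], ["a", "c"], ["b", "c"]]
p_tagged = bigram_probs(tagged)
p_untagged = bigram_probs(untagged)
print(interpolated_prob("a", "b", p_tagged, p_untagged))  # 0.7*0.5 + 0.3*(2/3)
```

The untagged corpus contributes probability mass for events that are rare or unseen in the tagged corpus, which is one route to the improved predictive power the abstract reports.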