Chinese Spell Checking Based on Noisy Channel Model



Abstract

Chinese spell checking is an important component of many NLP applications, including word processors, search engines, and automatic essay rating. Chinese text has no explicit word boundaries, and the various Chinese input methods produce different kinds of typos, so spell checkers are harder to develop for Chinese than for English. In this paper, we introduce a novel method for correcting Chinese typographical errors based on sound or shape similarity. In our approach, similar characters are automatically generated from Web corpora, and potential typos in a given sentence are then corrected within the noisy channel framework using a channel model and a character-based language model. In the training phase, we estimate the channel probabilities for each character from n-grams in a Web corpus. At run time, the system generates correction candidates for each character in the given sentence and selects the appropriate correction using the channel model and the language model.
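The run-time step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the confusion table, all probabilities, the choice of a bigram language model, and the names `CHANNEL`, `BIGRAM`, `FLOOR`, and `correct` are assumptions made up for this example. The sketch picks the character sequence c that maximizes P(c) * P(s | c) for an observed sentence s, using a Viterbi search over per-character candidates.

```python
import math

# Hypothetical toy channel model: CHANNEL[observed][intended] = P(observed | intended).
# In the paper these values are estimated from Web n-grams; here they are invented.
CHANNEL = {
    "門": {"門": 0.9, "們": 0.1},  # 門 may be a shape-similar typo for 們
}

# Hypothetical toy character bigram language model: P(current | previous).
BIGRAM = {
    ("<s>", "我"): 0.2,
    ("我", "們"): 0.3,
    ("我", "門"): 0.001,
    ("們", "去"): 0.1,
    ("門", "去"): 0.01,
}
FLOOR = 1e-6  # assumed probability for unseen bigrams


def candidates(observed):
    """Return {intended: P(observed | intended)}; unknown characters map to themselves."""
    return CHANNEL.get(observed, {observed: 1.0})


def lm(prev, cur):
    """Bigram probability with a constant floor for unseen pairs."""
    return BIGRAM.get((prev, cur), FLOOR)


def correct(sentence):
    """Viterbi search for the sequence maximizing log P(c) + log P(s | c)."""
    # best[c] = (log score of the best path ending in c, that path as a string)
    best = {"<s>": (0.0, "")}
    for observed in sentence:
        nxt = {}
        for cand, p_channel in candidates(observed).items():
            score, path = max(
                (prev_score + math.log(lm(prev, cand)), prev_path)
                for prev, (prev_score, prev_path) in best.items()
            )
            nxt[cand] = (score + math.log(p_channel), path + cand)
        best = nxt
    return max(best.values())[1]


print(correct("我門去"))
```

With these toy numbers, the language model's preference for 我們 outweighs the channel model's cost of replacing 門, so the input 我門去 is rewritten as 我們去. A sentence that is already well formed under the toy model is returned unchanged.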


Bibliographic record

Source: CIPS-SIGHAN joint conference on Chinese language processing | 2012 | 8 pages
Authors: Hsun-wen Chiu; Jian-cheng Wu; Jason S. Chang
Original format: PDF
CLC classification: Chinese language

Similar literature

1. Window flow control with error-checking scheme in quasi-cut-through switching network with noisy channels [J]. Cho Y.J., Un C.K. IEEE Transactions on Communications. 1991, No. 3
2. Modeling wireless fading channels via stochastic differential equations: identification and estimation based on noisy measurements [J]. Charalambous C.D., Bultitude R.J.C., Li X. IEEE Transactions on Wireless Communications. 2008, No. 2
3. A sequence-based approximate MMSE decoder for source coding over noisy channels using discrete hidden Markov models [J]. Miller D.J., Moonseo Park. IEEE Transactions on Communications. 1998, No. 2
4. Chinese Spell Checking Based on Noisy Channel Model [C]. Hsun-wen Chiu, Jian-cheng Wu, Jason S. Chang. CIPS-SIGHAN joint conference on Chinese language processing. 2014
5. Robust image transmission with rate-compatible low-density parity-check codes over noisy channels [D]. Pan, Xiang. 2005
6. A UMLS-based spell checker for natural language processing in vaccine safety [O]. Herman D Tolentino, Michael D Matters, Wikke Walop. 2007
7. Chinese Spell Checking Based on Noisy Channel Model [O]. Hsun-wen Chiu, Jian-cheng Wu, Jason S. Chang. 2015
