2011 International Conference on Electrical Engineering and Informatics

A non word error spell checker for Indonesian using morphologically analyzer and HMM


Abstract

A spell checker consists of two main stages: error detection and error correction. In this study, the spell checker uses a morphological analyzer and dictionary lookup for error detection, with two alternative lookup optimizations: binary search and hashing. For error correction, two alternative methods are used: forward reversed dictionary and probability of similarity. Forward reversed dictionary corrects a misspelled word by considering the edit distance between the misspelled word and its candidates. Probability of similarity, the main method proposed for error correction, corrects a misspelled word by calculating its similarity to each candidate word, based on the value of the optimum subsequence between them. Candidates are sorted using an HMM (Hidden Markov Model), in which the typed word is treated as the observed state and the candidates as the hidden states. With the HMM, the system considers not only the similarity of a candidate to the misspelled word but also the sequence of words in the sentence where the word occurs. The experimental results show that sorting candidates with the HMM improves precision. As for the correction method, the results show that probability of similarity achieves better correction accuracy than forward reversed dictionary.
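The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several parts are assumptions: the toy lexicon, the reading of "optimum subsequence" as the longest common subsequence (LCS), the normalization used in `similarity`, the use of that score as an HMM emission weight, and the bigram values passed as `transition`. The morphological analyzer and the forward reversed dictionary itself are omitted; only the edit distance it relies on is shown.

```python
from bisect import bisect_left
from math import log

# Toy lexicon (assumption). A real system would load a full Indonesian
# dictionary and strip affixes with a morphological analyzer first.
LEXICON = sorted(["makan", "makanan", "main", "malam", "saya", "nasi"])
LEXICON_SET = set(LEXICON)


def in_lexicon_hash(word):
    """Error detection by hash lookup (average O(1))."""
    return word in LEXICON_SET


def in_lexicon_binary(word):
    """Error detection by binary search over the sorted lexicon (O(log n))."""
    i = bisect_left(LEXICON, word)
    return i < len(LEXICON) and LEXICON[i] == word


def edit_distance(a, b):
    """Levenshtein distance, computed one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def lcs_length(a, b):
    """Length of the longest common subsequence of a and b."""
    prev = [0] * (len(b) + 1)
    for ca in a:
        cur = [0]
        for j, cb in enumerate(b, 1):
            cur.append(prev[j - 1] + 1 if ca == cb else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def similarity(word, candidate):
    """Assumed score: LCS length divided by the longer word's length."""
    return lcs_length(word, candidate) / max(len(word), len(candidate))


def viterbi(observed, candidates, transition, floor=1e-6):
    """Return the most likely sequence of intended words.

    observed:    the typed words (HMM observations)
    candidates:  candidates[i] lists the hidden states allowed for observed[i]
    transition:  assumed bigram probabilities keyed by (previous, current)
    floor:       small value substituted for missing or zero probabilities
    """
    # best[c] = (log score, path) of the best path ending in candidate c
    best = {
        c: (log(max(similarity(observed[0], c), floor)), [c])
        for c in candidates[0]
    }
    for obs, cands in zip(observed[1:], candidates[1:]):
        nxt = {}
        for c in cands:
            emit = log(max(similarity(obs, c), floor))
            score, path = max(
                (s + log(transition.get((p, c), floor)), pth)
                for p, (s, pth) in best.items()
            )
            nxt[c] = (score + emit, path + [c])
        best = nxt
    return max(best.values())[1]
```

As a usage example, `in_lexicon_hash("makn")` flags the typed word as a non-word, after which `viterbi(["saya", "makn", "nasi"], [["saya"], ["makan", "main", "malam"], ["nasi"]], bigrams)` ranks the candidates for it in sentence context, where `bigrams` is a hypothetical dictionary of bigram probabilities such as `{("saya", "makan"): 0.5, ("makan", "nasi"): 0.6}`. Scoring in log space avoids underflow on long sentences, and the `floor` value is a crude stand-in for proper smoothing.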
