Turkish spelling error detection and correction by using word n-grams

Abstract

N-grams can be used for spell checking and correction. The first step is to extract language-specific n-grams from a corpus. However, no corpus is large enough to contain every possible word n-gram. Back-off smoothing is one technique for estimating the frequency of n-grams that do not occur in the corpus. Using back-off smoothing and the Minimum Edit Distance (MED) algorithm, a program was developed to detect spelling errors and suggest corrections in sentences typed in Turkish. Its results were compared with those of the Microsoft Word 2003 proofing tools and were found to be much better.
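
The abstract names two building blocks, back-off estimation and Minimum Edit Distance, without giving details. The following Python sketch shows one way they might fit together. It is not the authors' program: the toy corpus, the function names, the fixed back-off weight `alpha`, and the distance threshold are all assumptions made for illustration, and the scoring is a simplified fixed-weight back-off rather than whichever back-off variant the paper actually uses.

```python
from collections import Counter

# Toy corpus, invented for illustration; a real system needs a large Turkish corpus.
CORPUS = [
    "bu kitap çok güzel",
    "bu kalem çok güzel",
    "ben kitap okudum",
    "o kalem aldı",
]


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: fewest insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def train(sentences):
    """Count word unigrams and bigrams in whitespace-tokenised sentences."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        words = sentence.split()
        unigrams.update(words)
        bigrams.update(zip(words, words[1:]))
    return unigrams, bigrams


def backoff_score(prev_word, word, unigrams, bigrams, alpha=0.4):
    """Relative bigram frequency if the bigram was seen; otherwise back off
    to the unigram frequency scaled by alpha (an assumed constant)."""
    if bigrams[(prev_word, word)] > 0:
        return bigrams[(prev_word, word)] / unigrams[prev_word]
    return alpha * unigrams[word] / sum(unigrams.values())


def suggest(prev_word, word, unigrams, bigrams, max_distance=2):
    """Return the word itself if known, else the best-scoring nearby word."""
    if word in unigrams:
        return word
    candidates = [w for w in unigrams if edit_distance(word, w) <= max_distance]
    if not candidates:
        return word  # nothing close enough; leave the word unchanged
    # Prefer the higher contextual score, then the smaller edit distance.
    return max(candidates,
               key=lambda w: (backoff_score(prev_word, w, unigrams, bigrams),
                              -edit_distance(word, w)))


unigrams, bigrams = train(CORPUS)
print(suggest("bu", "kitep", unigrams, bigrams))
```

In this sketch a word counts as misspelled when it is absent from the unigram table. Candidates are vocabulary words within the edit-distance threshold, ranked by how well they follow the preceding word. A fuller treatment would also handle Turkish morphology, sentence boundaries, and longer n-grams.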