Conference: Signal Processing and Communications Applications Conference

Correcting writing errors in Turkish with a character-level neural language model


Abstract

A large share of the written content on the Internet consists of social media posts, articles written for content platforms, and user comments. Unlike content prepared for print media, these texts contain many writing errors. Automating the detection and correction of writing errors in commercially produced content would reduce editing costs substantially. Although word-level language models perform well on analytic languages, they are not ideal for agglutinative languages such as Turkish, for which models built on smaller units such as morphemes or characters are more suitable. In this study, we propose a method that uses a character-level language model to correct writing errors in Turkish. Character-level text generation is used to calculate the probability of each candidate written form, and the most probable form is taken to be correct. The proposed method is applied to errors in writing the conjunction “de” and the suffix “-de”.
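The selection step described in the abstract (score each candidate written form with a character-level model, keep the most probable) can be illustrated with a minimal sketch. This is not the paper's model: a count-based character trigram model with add-k smoothing stands in for the neural language model, and the class `CharNgramLM`, the helpers `de_candidates` and `correct`, and the toy corpus are all hypothetical names and data invented for illustration. The sketch flips only one "de"/"da" at a time and does not normalize for length.

```python
import math
from collections import Counter

class CharNgramLM:
    """Add-k smoothed character n-gram model (a stand-in for a neural LM)."""

    def __init__(self, order=3, k=0.1):
        self.order, self.k = order, k
        self.ngram_counts = Counter()    # (context, next_char) -> count
        self.context_counts = Counter()  # context -> count
        self.vocab = set()

    def _pad(self, text):
        # '^' marks the start of the sentence, '$' the end.
        return "^" * (self.order - 1) + text + "$"

    def fit(self, sentences):
        for sentence in sentences:
            padded = self._pad(sentence)
            self.vocab.update(padded)
            for i in range(self.order - 1, len(padded)):
                ctx = padded[i - self.order + 1:i]
                self.ngram_counts[(ctx, padded[i])] += 1
                self.context_counts[ctx] += 1
        return self

    def log_prob(self, text):
        """Log-probability of the whole string, character by character."""
        padded = self._pad(text)
        v = len(self.vocab) + 1  # one extra slot for unseen characters
        total = 0.0
        for i in range(self.order - 1, len(padded)):
            ctx = padded[i - self.order + 1:i]
            num = self.ngram_counts[(ctx, padded[i])] + self.k
            den = self.context_counts[ctx] + self.k * v
            total += math.log(num / den)
        return total

CLITICS = ("de", "da")  # assumption: only these two vowel-harmony variants

def de_candidates(sentence):
    """Return the sentence plus variants with one de/da joined or split."""
    words = sentence.split()
    out = {" ".join(words)}
    for i, w in enumerate(words):
        if w in CLITICS and i > 0:
            # separate conjunction -> attached suffix
            out.add(" ".join(words[:i - 1] + [words[i - 1] + w] + words[i + 1:]))
        elif len(w) > 2 and w[-2:] in CLITICS:
            # attached suffix -> separate conjunction
            out.add(" ".join(words[:i] + [w[:-2], w[-2:]] + words[i + 1:]))
    return sorted(out)

def correct(sentence, lm):
    """Pick the candidate to which the character model assigns the highest score."""
    return max(de_candidates(sentence), key=lm.log_prob)

# Toy training data, invented for this sketch.
corpus = [
    "ben de geldim", "sen de geldin", "o da geldi", "evde kaldım",
    "okulda kaldım", "evde kitap var", "ben de evde kaldım",
]
lm = CharNgramLM(order=3).fit(corpus)
print(correct("bende geldim", lm))
print(correct("ev de kaldım", lm))
```

Because the candidates differ in length by one character, comparing raw log-probabilities slightly favors the shorter string; a per-character average would be one way to offset that, though the abstract does not say whether the authors do so.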
