Conference: IEEE International Scientific Conference on Informatics

Automatic restoration of diacritics based on word n-grams for Slovak texts


Abstract

Many people still write texts without diacritics, especially in chat messages, e-mails, and discussion posts. The habit has historical roots: people either ran into text-encoding problems in messages or wanted to type faster. In this paper, we propose an algorithm based on word n-grams (contiguous sequences of n words) that restores diacritics in text written in the Slovak language. We also evaluate our results against existing algorithms developed for Slovak texts.
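
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of word n-gram diacritics restoration, not the authors' method. It assumes a bigram model (n = 2) with a unigram tie-break, trained on correctly diacritized sentences. The function names `strip_diacritics`, `train`, and `restore`, as well as the two-sentence toy corpus, are hypothetical and chosen purely for illustration.

```python
import unicodedata
from collections import Counter, defaultdict


def strip_diacritics(text):
    """Remove combining marks after NFD decomposition, e.g. 'súd' -> 'sud'."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def train(sentences):
    """Count unigrams and bigrams, and map each stripped form to its diacritized variants."""
    unigrams = Counter()
    bigrams = Counter()
    candidates = defaultdict(set)
    for sentence in sentences:
        prev = "<s>"  # assumed sentence-start marker
        for word in sentence.lower().split():
            unigrams[word] += 1
            bigrams[(prev, word)] += 1
            candidates[strip_diacritics(word)].add(word)
            prev = word
    return unigrams, bigrams, candidates


def restore(text, model):
    """Greedy left-to-right restoration: pick the variant that best follows the previous word."""
    unigrams, bigrams, candidates = model
    restored = []
    prev = "<s>"
    for word in text.lower().split():
        # Words never seen in training are left unchanged.
        options = candidates.get(word, {word})
        # Rank by bigram count first, then by unigram count; sorting keeps ties deterministic.
        best = max(sorted(options), key=lambda c: (bigrams[(prev, c)], unigrams[c]))
        restored.append(best)
        prev = best
    return " ".join(restored)


# Toy corpus (invented for illustration): 'súd' (court) vs. 'sud' (barrel).
model = train(["najvyšší súd rozhodol", "drevený sud vína"])
print(restore("najvyssi sud rozhodol", model))  # najvyšší súd rozhodol
print(restore("dreveny sud vina", model))       # drevený sud vína
```

The greedy choice keeps the sketch short. A fuller implementation would typically use longer n-grams, smoothing for unseen word sequences, and a search over whole sentences rather than one word at a time.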
