Conference: SIGFSM Workshop on Statistical NLP and Weighted Automata

Data-Driven Spelling Correction using Weighted Finite-State Methods

Abstract

This paper presents two systems for spelling correction, formulated as a sequence labeling task. One system is an unstructured classifier; the other is structured. Both are implemented using weighted finite-state methods. On tweet normalization, the structured system delivers state-of-the-art results when compared with the recent AliSeTra system introduced by Eger et al. (2016), even though it is simpler than AliSeTra: it does not include a model for input segmentation. In addition to the tweet normalization experiments, we present experiments on OCR post-processing using an Early Modern Finnish corpus of OCR-processed newspaper text.
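
To make the sequence labeling formulation concrete, the following is a minimal sketch of an unstructured labeler in plain Python. It is not the paper's implementation: the abstract does not specify the label set, the features, or the weighted finite-state machinery, so everything here is an assumption made for illustration. In particular, the toy training pairs, the one-character context window, and the helper names `features`, `train`, and `correct` are hypothetical. Each input character receives a label that is the string it should be rewritten to (the empty string means deletion), and each position is labeled independently of the others.

```python
from collections import Counter, defaultdict

# Hypothetical training data: (noisy characters, per-character output labels).
# Each label is the string its character is rewritten to; "" means deletion.
TRAIN = [
    (list("u"), ["you"]),
    (list("gr8"), ["g", "r", "eat"]),
    (list("sooo"), ["s", "o", "", ""]),
    (list("l8r"), ["l", "ate", "r"]),
]

def features(chars, i):
    """Context window around position i: (previous char, char, next char)."""
    prev_c = chars[i - 1] if i > 0 else "<s>"
    next_c = chars[i + 1] if i + 1 < len(chars) else "</s>"
    return (prev_c, chars[i], next_c)

def train(pairs):
    """Count labels per context, plus a back-off table keyed on the character alone."""
    by_context = defaultdict(Counter)
    by_char = defaultdict(Counter)
    for chars, labels in pairs:
        for i, label in enumerate(labels):
            by_context[features(chars, i)][label] += 1
            by_char[chars[i]][label] += 1
    return by_context, by_char

def correct(word, model):
    """Label each character independently (unstructured), then concatenate the labels."""
    by_context, by_char = model
    chars = list(word)
    out = []
    for i, c in enumerate(chars):
        key = features(chars, i)
        if key in by_context:
            label = by_context[key].most_common(1)[0][0]
        elif c in by_char:
            label = by_char[c].most_common(1)[0][0]
        else:
            label = c  # unseen character: copy it unchanged
        out.append(label)
    return "".join(out)

model = train(TRAIN)
print(correct("gr8", model))   # great
print(correct("sooo", model))  # so
```

A structured system would differ in that it scores whole label sequences jointly instead of choosing each label in isolation. This sketch covers only the unstructured case and uses simple frequency counts in place of a trained, weighted model.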