Data-Driven Spelling Correction using Weighted Finite-State Methods

Abstract

This paper presents two systems for spelling correction, formulated as a sequence labeling task. One system is an unstructured classifier and the other is structured. Both are implemented using weighted finite-state methods. On the task of tweet normalization, the structured system delivers state-of-the-art results when compared with the recent AliSeTra system introduced by Eger et al. (2016), even though it is simpler than AliSeTra because it does not include a model for input segmentation. In addition to the experiments on tweet normalization, we present experiments on OCR post-processing using an Early Modern Finnish corpus of OCR-processed newspaper text.
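To make the "spelling correction as sequence labeling" framing concrete, here is a minimal sketch in Python. It is not the authors' weighted finite-state implementation, and the abstract does not specify the label inventory or features. The sketch assumes, purely for illustration, that training pairs are already aligned so that every input character carries one label (the output string it should be replaced by), and it uses a simple context-lookup classifier that labels each position independently, in the spirit of an unstructured classifier. The names `contexts`, `train`, `correct`, and `TOY_PAIRS` are hypothetical.

```python
from collections import Counter, defaultdict

PAD = "#"  # padding symbol for word boundaries (an assumption of this sketch)


def contexts(word, width=1):
    """Return one context window (the character plus `width` neighbours) per character."""
    padded = PAD * width + word + PAD * width
    return [padded[i:i + 2 * width + 1] for i in range(len(word))]


def train(aligned_pairs, width=1):
    """Map each context window to the label seen most often with it in training."""
    counts = defaultdict(Counter)
    for word, labels in aligned_pairs:
        for ctx, label in zip(contexts(word, width), labels):
            counts[ctx][label] += 1
    return {ctx: c.most_common(1)[0][0] for ctx, c in counts.items()}


def correct(word, model, width=1):
    """Label every position independently; copy the character if the context is unseen."""
    return "".join(
        model.get(ctx, ch) for ctx, ch in zip(contexts(word, width), word)
    )


# Toy, hand-aligned data (invented for illustration): one label per input character.
TOY_PAIRS = [
    ("u", ["you"]),
    ("gr8", ["g", "r", "eat"]),
    ("l8r", ["l", "ate", "r"]),
]

model = train(TOY_PAIRS)
print(correct("gr8", model))
print(correct("cat", model))
```

Because each position is decided on its own, this sketch cannot prefer label sequences that are jointly plausible; a structured model scores the whole label sequence instead.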

Bibliographic details

Source
SIGFSM workshop on statistical NLP and weighted automata | 2016 | pp. 51-59 | 9 pages
Conference location: Berlin (DE)
Authors
Miikka Silfverberg; Pekka Kauppinen; Krister Linden;
Author affiliations

Department of Modern Languages, University of Helsinki;

Department of Modern Languages, University of Helsinki;

Department of Modern Languages, University of Helsinki;

Original format: PDF
Text language: English (eng)