Source: Language and Technology Conference

Spanish Diacritic Error Detection and Restoration: A Survey

Abstract

In this paper we address the problem of diacritic error detection and restoration: the task of identifying and correcting missing accents in text. In particular, we evaluate the performance of a simple part-of-speech tagger-based technique, comparing it to other established methods for error detection and restoration: unigram frequency, decision lists, discriminative classifiers, a machine-translation-based method, and grapheme-based approaches. In languages such as Spanish (the focus here), diacritics play a key role in disambiguation, and the results show that a straightforward modification to an n-gram tagger achieves good performance in diacritic error identification without resorting to any specialized machinery. Our method should be applicable to any language in which diacritics are comparably distributed and play similar disambiguating roles.
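As a rough illustration of the simplest baseline the abstract names, unigram frequency, the Python sketch below maps each diacritic-stripped word to the accented form seen most often in training text. It is not the authors' implementation: the function names (`strip_diacritics`, `train_unigram`, `restore`) and the toy corpus are invented here, tokens are assumed to be lowercased already, and all combining marks are stripped (so "ñ" is also reduced to "n").

```python
import unicodedata
from collections import Counter, defaultdict


def strip_diacritics(word: str) -> str:
    """Drop all combining marks, e.g. 'está' -> 'esta'."""
    decomposed = unicodedata.normalize("NFD", word)
    kept = "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")
    return unicodedata.normalize("NFC", kept)


def train_unigram(tokens):
    """Map each stripped form to its most frequent spelling in `tokens`."""
    counts = defaultdict(Counter)
    for tok in tokens:
        counts[strip_diacritics(tok)][tok] += 1
    return {key: forms.most_common(1)[0][0] for key, forms in counts.items()}


def restore(tokens, model):
    """Replace each token with the most frequent form; unseen tokens pass through."""
    return [model.get(strip_diacritics(tok), tok) for tok in tokens]


# Toy training data, invented for illustration only.
toy_corpus = ["él", "está", "aquí", "esta", "casa", "está", "bien",
              "el", "libro", "el", "perro"]
model = train_unigram(toy_corpus)

restored = restore(["el", "perro", "esta", "aqui"], model)
```

Because this baseline ignores context, it always picks the same form for an ambiguous pair such as "esta"/"está" or "el"/"él". That is the kind of ambiguity that context-sensitive methods, such as the tagger-based technique described in the abstract, are meant to handle.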
