Detecting acronyms from capital letter sequences in Spanish

Abstract

This paper presents an automatic strategy for deciding how to pronounce a Capital Letter Sequence (CLS) in a Text-to-Speech (TTS) system. If the CLS is known to the TTS system, it can be expanded into several words. When the CLS is unknown, the system has two alternatives: spelling it out letter by letter (abbreviation) or pronouncing it as a new word (acronym). In Spanish, there is a close correspondence between letters and phonemes. Because of this, when a CLS resembles other Spanish words, there is a strong tendency to pronounce it as a standard word. This paper proposes an automatic method for detecting acronyms. Additionally, it analyses the discrimination capability of several features and compares strategies for combining them in order to obtain the best classifier. The best classifier achieves a classification error of 8.45%. In the feature analysis, the most discriminative features were the Letter Sequence Perplexity and the Average N-gram order.
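To make the idea of a letter-sequence perplexity feature concrete, the following is a minimal sketch and not the paper's implementation. The abstract does not state the n-gram order, the smoothing method, the training data, or the decision rule, so everything here is an assumption chosen for illustration: a letter bigram model with add-one smoothing, a tiny hypothetical word list (`TOY_WORDS`), an assumed symbol inventory of 28 (27 Spanish letters plus a boundary marker), and a single hand-picked threshold in place of a trained classifier.

```python
import math
from collections import Counter

BOUNDARY = "#"      # assumed word-boundary symbol
VOCAB_SIZE = 28     # assumption: 27 Spanish letters + boundary marker

# Hypothetical toy training list; a real system would need a large Spanish lexicon.
TOY_WORDS = ["casa", "mesa", "renta", "ovni", "pato", "sopa", "rama", "nota"]


def train_bigram_counts(words):
    """Count letter bigrams and their left contexts over boundary-padded words."""
    bigrams = Counter()
    contexts = Counter()
    for word in words:
        padded = BOUNDARY + word.lower() + BOUNDARY
        for prev, cur in zip(padded, padded[1:]):
            bigrams[(prev, cur)] += 1
            contexts[prev] += 1
    return bigrams, contexts


def letter_perplexity(sequence, bigrams, contexts, vocab_size=VOCAB_SIZE):
    """Perplexity of a letter sequence under an add-one smoothed bigram model."""
    padded = BOUNDARY + sequence.lower() + BOUNDARY
    log_prob = 0.0
    transitions = 0
    for prev, cur in zip(padded, padded[1:]):
        prob = (bigrams[(prev, cur)] + 1) / (contexts[prev] + vocab_size)
        log_prob += math.log(prob)
        transitions += 1
    return math.exp(-log_prob / transitions)


def classify_cls(sequence, bigrams, contexts, threshold):
    """Label a CLS as word-like ('acronym') or spelled out ('abbreviation').

    The single-threshold rule is an illustrative stand-in for a classifier.
    """
    ppl = letter_perplexity(sequence, bigrams, contexts)
    return "acronym" if ppl <= threshold else "abbreviation"


bigrams, contexts = train_bigram_counts(TOY_WORDS)
for cls in ["SOPA", "XKQZ"]:
    print(cls, round(letter_perplexity(cls, bigrams, contexts), 2),
          classify_cls(cls, bigrams, contexts, threshold=20.0))
```

A sequence whose letter transitions resemble those of the training words receives a lower perplexity than one made of unseen consonant clusters, which is the intuition behind treating low-perplexity sequences as pronounceable. With a corpus this small the absolute values mean little; only the relative ordering is informative.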