Conference: International Conference on Text, Speech and Dialogue

Improving a Long Audio Aligner through Phone-Relatedness Matrices for English, Spanish and Basque

Abstract

A multilingual long audio alignment system for automatic subtitling is presented, supporting English, Spanish and Basque. Prerecorded content is recognized at the phoneme level by language-dependent triphone-based decoders, and the transcripts are converted to phoneme sequences using grapheme-to-phoneme transcribers. An optimized version of Hirschberg's algorithm aligns the two phoneme sequences to find matches. The correctly aligned phonemes, together with the time-codes obtained in the recognition step, serve as the reference for producing near-perfectly aligned subtitles. The performance of the alignment algorithm is evaluated using different non-binary scoring matrices based on phone confusion pairs from each decoder, on phonological similarity and on human perception errors. This system is an evolution of our previous successful system for long audio alignment.
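
The idea of aligning two phoneme sequences under a non-binary scoring matrix can be illustrated with a minimal sketch. It is not the authors' implementation: for brevity it fills the full Needleman-Wunsch table instead of using Hirschberg's linear-space divide-and-conquer, which reaches the same optimal score with less memory. The names `align`, `score` and `GAP`, the gap penalty, and the similarity value for the pair k/g are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Optional, Tuple

Phone = str
Pair = Tuple[Optional[Phone], Optional[Phone]]

GAP = -1.0  # assumed gap penalty, not taken from the paper


def score(a: Phone, b: Phone, sim: Dict[Tuple[Phone, Phone], float]) -> float:
    """Non-binary substitution score: 1.0 for identical phones, a graded
    value for related phones listed in `sim`, -1.0 for unrelated phones."""
    if a == b:
        return 1.0
    return sim.get((a, b), sim.get((b, a), -1.0))


def align(ref: List[Phone], hyp: List[Phone],
          sim: Dict[Tuple[Phone, Phone], float]) -> Tuple[float, List[Pair]]:
    """Globally align transcript phones (`ref`) with recognized phones (`hyp`).

    Returns the optimal score and the aligned pairs; None marks a gap.
    """
    n, m = len(ref), len(hyp)
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    back = [[""] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0], back[i][0] = dp[i - 1][0] + GAP, "up"
    for j in range(1, m + 1):
        dp[0][j], back[0][j] = dp[0][j - 1] + GAP, "left"
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            choices = [
                (dp[i - 1][j - 1] + score(ref[i - 1], hyp[j - 1], sim), "diag"),
                (dp[i - 1][j] + GAP, "up"),
                (dp[i][j - 1] + GAP, "left"),
            ]
            # On ties, max() keeps the first entry, so "diag" is preferred.
            dp[i][j], back[i][j] = max(choices, key=lambda c: c[0])

    # Follow the stored back-pointers from the bottom-right corner.
    pairs: List[Pair] = []
    i, j = n, m
    while i > 0 or j > 0:
        move = back[i][j]
        if move == "diag":
            pairs.append((ref[i - 1], hyp[j - 1]))
            i, j = i - 1, j - 1
        elif move == "up":
            pairs.append((ref[i - 1], None))
            i -= 1
        else:
            pairs.append((None, hyp[j - 1]))
            j -= 1
    pairs.reverse()
    return dp[n][m], pairs
```

With a binary matrix, a recognized "g" in place of a transcript "k" would count as a plain mismatch. With a graded entry such as `{("k", "g"): 0.5}`, the pair still contributes positively, so the aligner is more likely to keep the surrounding phones anchored. In a system like the one described, each recognized phone would also carry a time-code that the matched transcript phone could inherit.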