Conference on Speech Technology and Human-Computer Dialogue

Multilingual query by example spoken term detection for under-resourced languages

Abstract

We propose a query-by-example approach to multilingual Spoken Term Detection (STD) for under-resourced languages, based on Automatic Speech Recognition (ASR). The approach addresses the two main difficulties met under these conditions: it provides a new method for building multilingual acoustic models from little annotated data, and it searches approximate ASR transcriptions in a way that scales well. The acoustic models are obtained by adapting well-trained phoneme models to the phonemes of the target languages. The mapping follows the International Phonetic Alphabet (IPA) phoneme classification and a confusion matrix. Weighting by query length and alignment spread is incorporated into the Dynamic Time Warping (DTW) technique to improve the search. Experimental validation was conducted on a standard data set consisting of 3 hours of mixed African languages. The recorded speech is of telephone quality and mixes read and spontaneous speech.
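
The abstract does not give the formulas behind the search step, so the following is only a minimal sketch of the general idea: Dynamic Time Warping (DTW) of a query phoneme sequence against an approximate ASR phoneme transcription, with a substitution cost drawn from a confusion matrix and a final score that is normalized by query length and penalized for alignment spread. The function names, the `confusion` dictionary layout, the `spread_weight` parameter, and the exact scoring formula are all assumptions made for illustration, not the authors' method.

```python
def substitution_cost(a, b, confusion=None):
    """Cost of aligning query phoneme `a` with transcript phoneme `b`.

    `confusion` is a hypothetical dict mapping (a, b) to an assumed
    probability that `a` is recognized as `b`.
    """
    if a == b:
        return 0.0
    if confusion is not None:
        return 1.0 - confusion.get((a, b), 0.0)
    return 1.0


def subsequence_dtw(query, transcript, confusion=None, spread_weight=0.5):
    """Find the best match of `query` anywhere inside `transcript`.

    Returns (score, start, end) with `end` exclusive; lower is better.
    The score form is an assumption: accumulated cost divided by query
    length, plus a penalty when the matched span differs from that length.
    """
    n, m = len(query), len(transcript)
    if n == 0 or m == 0:
        raise ValueError("query and transcript must be non-empty")

    inf = float("inf")
    # cost[i][j]: best accumulated cost of aligning query[:i+1] so that
    # query[i] is matched to transcript[j].
    # start[i][j]: transcript index where that alignment began.
    cost = [[inf] * m for _ in range(n)]
    start = [[0] * m for _ in range(n)]

    # A match may begin at any transcript position at no extra cost.
    for j in range(m):
        cost[0][j] = substitution_cost(query[0], transcript[j], confusion)
        start[0][j] = j

    for i in range(1, n):
        for j in range(m):
            d = substitution_cost(query[i], transcript[j], confusion)
            # Query advances while the transcript position stays.
            candidates = [(cost[i - 1][j], start[i - 1][j])]
            if j > 0:
                # Both advance, or only the transcript advances.
                candidates.append((cost[i - 1][j - 1], start[i - 1][j - 1]))
                candidates.append((cost[i][j - 1], start[i][j - 1]))
            best_cost, best_start = min(candidates)
            cost[i][j] = best_cost + d
            start[i][j] = best_start

    # Re-rank every possible end point with the length and spread weighting.
    best = None
    for j in range(m):
        s = start[n - 1][j]
        span = j - s + 1
        spread = abs(span - n) / n
        score = cost[n - 1][j] / n + spread_weight * spread
        if best is None or score < best[0]:
            best = (score, s, j + 1)
    return best
```

Calling `subsequence_dtw(["k", "a", "t"], ["s", "i", "k", "a", "t", "o"])` locates the query at transcript positions 2 to 5 (end exclusive) with a score of 0.0. Note that in this sketch the spread penalty is applied only when ranking end points, after the dynamic-programming pass; folding it into the recursion itself would be a different design.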