Conference: 2013 7th Conference on Speech Technology and Human-Computer Dialogue

Multilingual query by example spoken term detection for under-resourced languages



Abstract

We propose a query-by-example approach to multilingual Spoken Term Detection (STD) for under-resourced languages, based on Automatic Speech Recognition (ASR). The approach addresses the two main difficulties of this setting: it provides a new method for building multilingual acoustic models from little annotated data, and it searches in approximate ASR transcriptions, which gives high scalability. The acoustic models are obtained by adapting well-trained phonemes to those of the target languages. The mapping follows the International Phonetic Alphabet phoneme classification and a confusion matrix. To improve the search, weights for query length and alignment spread are incorporated into the Dynamic Time Warping technique. Experimental validation was conducted on a standard data set consisting of 3 hours of mixed African languages. The recorded speech is of telephone quality and is a mix of read and spontaneous speech.
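
The abstract does not give the formulas behind the weighted Dynamic Time Warping search, so the following is only an illustrative sketch of the general idea, not the authors' method. It runs a subsequence DTW of a query phoneme sequence against a phoneme transcription, divides the accumulated cost by the query length, and adds a penalty when the matched span is longer or shorter than the query. The names `substitution_cost`, `search_query`, and `spread_weight`, the 0/1 local cost, and the exact form of both weights are assumptions made for this example; a confusion-matrix-based cost could be substituted for the 0/1 cost.

```python
from typing import List, Sequence, Tuple


def substitution_cost(a: str, b: str) -> float:
    """Hypothetical local cost: 0 for identical phoneme labels, 1 otherwise."""
    return 0.0 if a == b else 1.0


def search_query(
    query: Sequence[str],
    transcript: Sequence[str],
    spread_weight: float = 0.5,  # assumed value, not taken from the paper
) -> Tuple[float, int, int]:
    """Return (score, start, end) of the best-matching transcript span.

    `end` is exclusive. Lower scores mean better matches.
    """
    n, m = len(query), len(transcript)
    if n == 0 or m == 0:
        raise ValueError("query and transcript must be non-empty")

    # cost[i][j]: accumulated cost of aligning query[: i + 1] with a span
    # that ends at transcript[j]; start[i][j]: where that span begins.
    cost: List[List[float]] = [[0.0] * m for _ in range(n)]
    start: List[List[int]] = [[0] * m for _ in range(n)]

    # Free start: the first query phoneme may align anywhere in the transcript.
    for j in range(m):
        cost[0][j] = substitution_cost(query[0], transcript[j])
        start[0][j] = j

    for i in range(1, n):
        cost[i][0] = cost[i - 1][0] + substitution_cost(query[i], transcript[0])
        start[i][0] = 0
        for j in range(1, m):
            candidates = [
                (cost[i - 1][j - 1], start[i - 1][j - 1]),  # diagonal step
                (cost[i - 1][j], start[i - 1][j]),          # query advances
                (cost[i][j - 1], start[i][j - 1]),          # transcript advances
            ]
            best_cost, best_start = min(candidates, key=lambda c: c[0])
            cost[i][j] = best_cost + substitution_cost(query[i], transcript[j])
            start[i][j] = best_start

    # Free end: score every possible end position. The spread penalty is
    # applied after the DTW pass, so the result is a heuristic rather than a
    # joint optimum of cost and spread.
    best = (float("inf"), 0, 0)
    for j in range(m):
        span = j - start[n - 1][j] + 1
        score = cost[n - 1][j] / n + spread_weight * abs(span - n) / n
        if score < best[0]:
            best = (score, start[n - 1][j], j + 1)
    return best
```

For example, `search_query(["k", "a", "t"], ["s", "i", "k", "a", "t", "o"])` locates the query at positions 2 to 5 (end exclusive) with a score of 0.0, since the span matches exactly and has the same length as the query.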
