Conference: European Conference on Speech Communication and Technology

Preliminary experiments on language identification using broadcast news recordings


Abstract

This article presents experiments on language identification using broadcast news recordings, for which large amounts of data are available. The system uses a broadcast news partitioner developed by LIMSI to extract the speech segments from the raw signals. These segments are then transcribed using a language-independent HMM acoustic model. Phonotactic models are trained for each language and used to score the transcriptions of the test signals. Training was conducted on recordings from three monolingual radio stations (about 17 h of signal per language), and tests were run on signals from other radio stations. We also investigated a rejection strategy to improve the identification results. Without any rejection, the error rates range from 13.8% (5 s segments) to 4.3% (45 s segments). Rejecting 1/3 of the data reduces the error rate by 78% for 10 s segments.
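
The phonotactic scoring and rejection steps described in the abstract can be illustrated with a minimal sketch. The abstract does not give the n-gram order, the smoothing method, or the rejection criterion, so the choices here are assumptions made only for illustration: a phone bigram model with add-one smoothing, and rejection when the gap between the two best per-phone log-likelihoods falls below a threshold. The class `BigramPhonotacticModel`, the function `identify`, the `margin` parameter, and the toy phone strings are all hypothetical.

```python
import math
from collections import Counter


class BigramPhonotacticModel:
    """Phone bigram model with add-one smoothing.

    The bigram order and the smoothing are assumptions; the abstract
    only says that one phonotactic model is trained per language.
    """

    def __init__(self, sequences, vocab):
        self.vocab_size = len(vocab)
        self.bigrams = Counter()   # counts of (previous phone, phone)
        self.contexts = Counter()  # counts of the previous phone alone
        for seq in sequences:
            prev = "<s>"  # start-of-segment symbol
            for phone in seq:
                self.bigrams[(prev, phone)] += 1
                self.contexts[prev] += 1
                prev = phone

    def avg_log_prob(self, seq):
        """Per-phone log-likelihood of a decoded phone sequence."""
        total = 0.0
        prev = "<s>"
        for phone in seq:
            num = self.bigrams[(prev, phone)] + 1
            den = self.contexts[prev] + self.vocab_size
            total += math.log(num / den)
            prev = phone
        return total / max(len(seq), 1)


def identify(models, seq, margin=0.0):
    """Return the best-scoring language, or None if the segment is rejected.

    Hypothetical rejection rule: reject when the best and second-best
    scores are closer than `margin`. Needs at least two models.
    """
    ranked = sorted(
        ((model.avg_log_prob(seq), lang) for lang, model in models.items()),
        reverse=True,
    )
    (best, best_lang), (second, _) = ranked[0], ranked[1]
    if best - second < margin:
        return None
    return best_lang


# Toy phone inventory and training strings (invented, not from the paper).
vocab = ["a", "b", "k", "s", "t"]
training = {
    "L1": [list("katak"), list("takat"), list("kataka")],
    "L2": [list("sabas"), list("basab"), list("sabasa")],
}
models = {lang: BigramPhonotacticModel(seqs, vocab) for lang, seqs in training.items()}

decision = identify(models, list("takata"), margin=0.1)
```

Raising `margin` rejects more segments in exchange for fewer errors on the accepted ones, which is the trade-off behind the rejection figures quoted above.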
