International Conference on Speech Database and Assessments

Collection and annotation of Malay conversational speech corpus

Abstract

We report the development of a Malay conversational speech corpus as part of our research on large-vocabulary continuous speech recognition (LVCSR) for spontaneous conversational speech. The corpus is being developed as a collaboration between NTU and USM. The goal is to collect, transcribe, and annotate 50 hours of conversational Malay speech. Each conversation is recorded over both close-talk and telephone channels, and the two speakers' utterances are kept in separate tracks. Besides the word-level transcription, we also annotate linguistic phenomena such as fillers and disfluencies. To date, 20 hours have been recorded, transcribed, and analyzed. The details of our analysis are presented in this report.
