Source: Chinese Spoken Language Processing; Lecture Notes in Artificial Intelligence; 4274

Development of Multi-lingual Spoken Corpora of Indian Languages


Abstract

This paper describes a recently initiated effort to collect and transcribe read as well as spontaneous speech data in four Indian languages. The completed preparatory work includes the design of phonetically rich sentences, a data acquisition setup for recording speech data over a telephone channel, and a Wizard of Oz setup for acquiring speech data from a spoken dialogue between a caller and the machine in the context of a remote information retrieval task. The paper gives an account of the care taken to collect speech data that is as close to real-world conditions as possible, along with the current status of the programme and the set of actions planned to achieve the goal.
