Annual Conference of the International Speech Communication Association (INTERSPEECH 2010)

Building transcribed speech corpora quickly and cheaply for many languages

Abstract

We present a system for quickly and cheaply building transcribed speech corpora containing utterances from many speakers in a variety of acoustic conditions. The system consists of a client application running on an Android mobile device with an intermittent Internet connection to a server. The client application collects demographic information about the speaker, fetches textual prompts from the server for the speaker to read, records the speaker's voice, and uploads the audio and associated metadata to the server. The system has so far been used to collect over 3000 hours of transcribed audio in 17 languages around the world.
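The client workflow described above (fetch prompts, record the speaker, upload audio and metadata over an intermittent connection) can be illustrated with a minimal sketch. This is not the authors' implementation: the names `Recording`, `CollectionClient`, `refill_prompts`, `record_next`, and `flush` are hypothetical, and the server and microphone are replaced by injected callables. The sketch assumes that a lost connection surfaces as a `ConnectionError` and that unsent recordings are held in a local queue until the next successful upload.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Deque, Dict, List


@dataclass
class Recording:
    """One recorded utterance plus the metadata uploaded with it (hypothetical)."""
    prompt: str               # text the speaker was asked to read
    audio: bytes              # recorded audio for that prompt
    speaker: Dict[str, str]   # demographic information about the speaker


@dataclass
class CollectionClient:
    """Illustrative client; server access is injected as plain callables."""
    fetch_prompts: Callable[[], List[str]]   # may raise ConnectionError
    upload: Callable[[Recording], None]      # may raise ConnectionError
    speaker: Dict[str, str]
    prompts: Deque[str] = field(default_factory=deque)
    pending: Deque[Recording] = field(default_factory=deque)

    def refill_prompts(self) -> bool:
        """Try to fetch more prompts; return False if the server is unreachable."""
        try:
            self.prompts.extend(self.fetch_prompts())
            return True
        except ConnectionError:
            return False

    def record_next(self, record: Callable[[str], bytes]) -> bool:
        """Record the next prompt and queue the result locally for upload."""
        if not self.prompts:
            return False
        prompt = self.prompts.popleft()
        audio = record(prompt)
        self.pending.append(Recording(prompt, audio, dict(self.speaker)))
        return True

    def flush(self) -> int:
        """Upload queued recordings in order; stop at the first connection failure."""
        sent = 0
        while self.pending:
            try:
                self.upload(self.pending[0])
            except ConnectionError:
                break  # keep the recording queued and retry on a later flush
            self.pending.popleft()
            sent += 1
        return sent
```

In this sketch a recording leaves the local queue only after its upload succeeds, so a dropped connection delays delivery without losing audio. Whether the actual system buffers recordings this way is an assumption made for illustration.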
