The development of file formats for very large speech corpora: SPHERE and SHORTEN

Abstract

The performance of large vocabulary speech recognition systems is currently thought to be limited by the size of the corpus used to train the recognition system. Hence several very large speech corpora have been created recently and many more are planned. A significant problem in the generation of these corpora is the definition of their format to minimize distribution costs and maximize ease of use. This paper describes the development of a "standard" lossless compressed waveform file format which minimizes the media required for corpora distribution while maximizing accessibility. This paper contains two primary contributions: 1) The use of a "standard" file format for speech corpora which supports embedded compression and the development of a software interface toolkit which supports automatic waveform compression/decompression; 2) The use of lossless data compression for speech corpora. This task differs from mainstream speech coding in that the compression must be fast and lossless. Fast approximations to the standard techniques of linear prediction and residual coding have been developed and are employed.
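The pairing of linear prediction with residual coding mentioned in the abstract can be illustrated with a minimal sketch. The details below are assumptions chosen for illustration, not taken from the abstract: a fixed second-order polynomial predictor (implemented as repeated differencing with zero initial history) stands in for the "fast approximation" to linear prediction, and a Rice code with a hand-picked parameter k stands in for the residual coder. All function names are hypothetical.

```python
def residuals(samples, order=2):
    """Prediction residuals of a fixed polynomial predictor.

    Assumption: the predictor is the order-th finite difference with
    zero initial history (for order=2: pred = 2*s[t-1] - s[t-2]).
    """
    res = list(samples)
    for _ in range(order):
        prev, out = 0, []
        for x in res:
            out.append(x - prev)
            prev = x
        res = out
    return res


def reconstruct(res, order=2):
    """Invert residuals() exactly by repeated cumulative summation."""
    out = list(res)
    for _ in range(order):
        acc, tmp = 0, []
        for r in out:
            acc += r
            tmp.append(acc)
        out = tmp
    return out


def zigzag(n):
    """Map signed integers to non-negative ones: 0,-1,1,-2,... -> 0,1,2,3,..."""
    return 2 * n if n >= 0 else -2 * n - 1


def unzigzag(u):
    """Inverse of zigzag()."""
    return u >> 1 if u % 2 == 0 else -((u + 1) >> 1)


def rice_encode(values, k):
    """Rice-code integers as a string of '0'/'1' characters.

    Each value is a unary quotient (q ones, then a zero) followed by
    the k low-order bits of the zigzag-mapped value.
    """
    pieces = []
    for v in values:
        u = zigzag(v)
        q, r = u >> k, u & ((1 << k) - 1)
        code = "1" * q + "0"
        if k:
            code += format(r, "0{}b".format(k))
        pieces.append(code)
    return "".join(pieces)


def rice_decode(bits, k, count):
    """Decode `count` integers produced by rice_encode() with the same k."""
    values, pos = [], 0
    for _ in range(count):
        q = 0
        while bits[pos] == "1":
            q += 1
            pos += 1
        pos += 1  # skip the '0' that terminates the unary part
        r = int(bits[pos:pos + k], 2) if k else 0
        pos += k
        values.append(unzigzag((q << k) | r))
    return values


# Round trip on a short, smooth toy signal (not real speech data).
samples = [0, 3, 7, 12, 16, 19, 20, 18, 13, 7]
res = residuals(samples, order=2)
bits = rice_encode(res, k=1)
decoded = reconstruct(rice_decode(bits, k=1, count=len(res)), order=2)
```

Because both the differencing and the Rice code use only integer operations, the round trip is bit-exact, which is the property that separates this task from lossy speech coding. Smooth signals yield small residuals, and small residuals yield short codes; a practical coder would also choose the predictor order and k per block rather than fixing them.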