
A Tool for Cutting Large Speech Corpora: HCI4CS

Abstract

There are two ways to cut large speech corpora: traditional manual segmentation and automatic machine segmentation. With manual segmentation, quality is easy to control, but the shortcomings are obvious: it is inefficient and costly. Automatic segmentation is highly efficient, but the tedious work of finding cutting errors cannot be omitted. This paper therefore presents a human-computer interaction tool for cutting speech corpora (HCI4CS), which provides a segmentation algorithm and parameters to control it, supports correcting errors in the automatic segmentation results, and generates label files for the HTK toolkit. The study material was one thousand Primi utterances. Using HCI4CS, a person with little expertise in cutting speech corpora can achieve nearly one hundred percent accuracy.
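
The abstract does not say which segmentation algorithm or parameters HCI4CS uses, so the following is only a minimal sketch of the general idea, assuming a simple short-time energy threshold. The function names, the parameters (`frame_ms`, `threshold`, `min_silence_frames`), and the `speech` label are all hypothetical. The one fixed point is the HTK label file convention: one segment per line, with start and end times given as integers in units of 100 ns.

```python
from typing import List, Sequence, Tuple


def segment_by_energy(
    samples: Sequence[float],
    sample_rate: int,
    frame_ms: float = 10.0,        # hypothetical parameter: analysis frame length
    threshold: float = 0.01,       # hypothetical parameter: mean-square energy cutoff
    min_silence_frames: int = 20,  # hypothetical parameter: pause length that ends a segment
) -> List[Tuple[float, float]]:
    """Return (start_sec, end_sec) spans whose short-time energy exceeds the threshold."""
    frame_len = max(1, int(sample_rate * frame_ms / 1000))
    n_frames = len(samples) // frame_len

    # Mark each frame as voiced or silent by its mean-square energy.
    voiced = []
    for i in range(n_frames):
        frame = samples[i * frame_len:(i + 1) * frame_len]
        energy = sum(x * x for x in frame) / frame_len
        voiced.append(energy > threshold)

    segments = []
    start = None   # index of the first frame of the open segment, if any
    silence = 0    # consecutive silent frames seen inside the open segment
    for i, is_voiced in enumerate(voiced):
        if is_voiced:
            if start is None:
                start = i
            silence = 0
        elif start is not None:
            silence += 1
            if silence >= min_silence_frames:
                end = i - silence + 1  # first silent frame after the speech
                segments.append((start * frame_len / sample_rate,
                                 end * frame_len / sample_rate))
                start, silence = None, 0

    # Close a segment that is still open when the audio ends.
    if start is not None:
        end = n_frames - silence
        segments.append((start * frame_len / sample_rate,
                         end * frame_len / sample_rate))
    return segments


def to_htk_label_lines(segments: Sequence[Tuple[float, float]],
                       label: str = "speech") -> List[str]:
    """Format segments as HTK label lines: start and end in 100 ns units, then the label."""
    return [f"{round(s * 1e7)} {round(e * 1e7)} {label}" for s, e in segments]
```

In an interactive tool of the kind the abstract describes, the boundaries produced by a step like this would be shown to the user for correction before the label file is written; adjusting `threshold` or `min_silence_frames` is one way such a tool could expose parameter control.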
