Conference: International Conference on Text, Speech and Dialogue

Correction of Prosodic Phrases in Large Speech Corpora

Abstract

Many speech processing tasks, such as speech recognition and synthesis, now rely on very large speech corpora, which usually contain several hours of speech or more. To achieve the best possible results, the recorded utterances often need to be annotated appropriately. This paper focuses on problems related to the prosodic annotation of Czech speech corpora. In Czech, utterances are assumed to be split by pauses into so-called prosodic clauses, each containing one or more prosodic phrases. The type of each phrase is linked to its last prosodic word, which corresponds to one of several functionally involved prosodemes. The clause/phrase structure is largely determined by the sentence composition. In real speech data, however, a different prosodeme type, or even different phrase/clause borders, may be present. This paper deals with two basic problems: correcting improper prosodeme/phrase types and detecting new phrase borders. For both tasks, we propose new procedures based on hidden Markov models. Experiments were performed on four large speech corpora recorded by professional speakers for the purpose of speech synthesis, and were limited to declarative sentences. The results were successfully verified by listening tests.
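The abstract does not describe the proposed procedures in detail, so the following is only a generic illustration of how a hidden Markov model can assign a label to each prosodic word in a sequence. It is a minimal Viterbi decoding sketch, not the authors' method. The state names (`inside`, `border`), the discretized pitch observations (`level`, `fall`), and all probabilities are hypothetical values chosen for the example.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely hidden-state sequence for the observations."""

    def log(p):
        # Work in log space to avoid underflow on long sequences.
        return math.log(p) if p > 0 else float("-inf")

    # best[t][s]: log-probability of the best path ending in state s at step t
    best = [{s: log(start_p[s]) + log(emit_p[s][observations[0]]) for s in states}]
    # back[t][s]: predecessor of state s on that best path
    back = [{}]

    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            prev, score = max(
                ((p, best[t - 1][p] + log(trans_p[p][s])) for p in states),
                key=lambda item: item[1],
            )
            best[t][s] = score + log(emit_p[s][observations[t]])
            back[t][s] = prev

    # Trace back from the best final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path


# Hypothetical toy model: each prosodic word is either inside a phrase
# or at a phrase border; the observation is a coarse pitch-movement class.
states = ["inside", "border"]
start_p = {"inside": 0.9, "border": 0.1}
trans_p = {
    "inside": {"inside": 0.7, "border": 0.3},
    "border": {"inside": 0.9, "border": 0.1},
}
emit_p = {
    "inside": {"level": 0.8, "fall": 0.2},
    "border": {"level": 0.2, "fall": 0.8},
}

observations = ["level", "level", "fall", "level", "fall"]
path = viterbi(observations, states, start_p, trans_p, emit_p)
print(path)
```

On this toy input the decoder labels the third and fifth words as `border` and the rest as `inside`. In a real setting the observations would be acoustic features and the model parameters would be trained on annotated corpus data rather than set by hand.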
