Published in: IEEE Transactions on Audio, Speech, and Language Processing

An Acoustic Measure for Word Prominence in Spontaneous Speech

Abstract

This paper reports an algorithm for automatic detection of speech prominence. We describe a comparative analysis of various acoustic features for word prominence detection and report results using a spoken dialog corpus with manually assigned prominence labels. The focus is on features such as spectral intensity and speech rate that are extracted directly from speech using a correlation-based approach, without requiring explicit linguistic or phonetic knowledge. Additionally, various pitch-based measures are studied with respect to their discriminating ability for prominence detection. A parametric scheme for modeling the pitch plateau is proposed, and this feature alone is found to outperform the traditional local pitch statistics. Two sets of experiments are used to explore the usefulness of the acoustic score generated using these features. The first set focuses on a more traditional way of word prominence detection based on a manually tagged corpus. A 76.8% classification accuracy was achieved on a corpus of role-playing spoken dialogs. Because it is difficult to manually tag speech prominence into discrete levels (categories), the second set of experiments evaluates the score indirectly. Specifically, through experiments on the Switchboard corpus, it is shown that the proposed acoustic score can discriminate between content words and function words in a statistically significant way. The relation between speech prominence and content/function words is also explored. Since prominent words tend to be predominantly content words, and since content words can be automatically marked from text-derived part-of-speech (POS) information, it is shown that the proposed acoustic score can be indirectly cross-validated through POS information.
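
The indirect evaluation described in the abstract can be illustrated with a small sketch. The abstract does not say which statistical test the authors used, so the permutation test below is a stand-in chosen for illustration, not the paper's method. The function names, the choice of Penn Treebank tag prefixes as "content" classes, and the toy scores are all assumptions made for this example.

```python
import random
from statistics import mean

# Assumption: Penn Treebank tags starting with these prefixes count as
# content words (nouns, verbs, adjectives, adverbs). This is a rough
# simplification; for instance, auxiliaries tagged VB* are included.
CONTENT_TAG_PREFIXES = ("NN", "VB", "JJ", "RB")


def is_content_word(pos_tag: str) -> bool:
    """Hypothetical helper: mark a word as content or function from its POS tag."""
    return pos_tag.startswith(CONTENT_TAG_PREFIXES)


def permutation_test(scores, pos_tags, n_permutations=2000, seed=0):
    """Return (observed mean difference, one-sided p-value).

    Tests whether per-word acoustic scores are higher for content words
    than for function words by shuffling scores across the fixed labels.
    """
    labels = [is_content_word(tag) for tag in pos_tags]
    n_content = sum(labels)
    if n_content == 0 or n_content == len(labels):
        raise ValueError("need both content and function words")

    def mean_diff(values):
        content = [v for v, c in zip(values, labels) if c]
        function = [v for v, c in zip(values, labels) if not c]
        return mean(content) - mean(function)

    observed = mean_diff(scores)
    rng = random.Random(seed)
    shuffled = list(scores)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if mean_diff(shuffled) >= observed:
            hits += 1
    # Add-one smoothing keeps the estimated p-value strictly positive.
    p_value = (hits + 1) / (n_permutations + 1)
    return observed, p_value


if __name__ == "__main__":
    # Toy data only; these are not results from the paper.
    toy_scores = [0.9, 0.8, 0.85, 0.95, 0.7, 0.1, 0.2, 0.15, 0.05, 0.3]
    toy_tags = ["NN", "VBD", "JJ", "NNS", "RB", "DT", "IN", "PRP", "CC", "TO"]
    diff, p = permutation_test(toy_scores, toy_tags)
    print(f"mean difference = {diff:.2f}, p = {p:.4f}")
```

A permutation test is used here because it needs no distributional assumptions and only the standard library; a t-test or rank-based test would serve the same illustrative purpose.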
