Speech variability: A cross-language study on acoustic variations of speaking versus untrained singing

Hansen John H. L.; Bokshi Marigona; Khorram Soheil


Abstract

Speech production variability introduces significant challenges for existing speech technologies such as speaker identification (SID), speaker diarization, speech recognition, and language identification (ID). There has been limited research analyzing changes in acoustic characteristics for speech produced by untrained singing versus speaking. To better understand changes in speech production of the untrained singing voice, this study presents the first cross-language comparison between normal speaking and untrained karaoke singing of the same text content. Previous studies comparing professional singing versus speaking have shown deviations in both prosodic and spectral features. Some investigations also considered assigning the intrinsic activity of the singing. Motivated by these studies, a series of experiments investigating both prosodic and spectral variations of untrained karaoke singers is considered for three languages: American English, Hindi, and Farsi. A comprehensive comparison of common prosodic features, including phoneme duration, mean fundamental frequency (F0), and formant center frequencies of vowels, was performed. Collective changes in the corresponding overall acoustic spaces were analyzed based on the Kullback-Leibler distance, using Gaussian probability distribution models trained on spectral features. Finally, these models were used in a Gaussian mixture model with universal background model (GMM-UBM) SID evaluation to quantify speaker changes between speaking and singing when the audio text content is the same. The experiments showed that many acoustic characteristics of untrained singing are considerably different from speaking when the text content is the same. It is suggested that these results would help advance automatic speech production normalization/compensation to improve performance of speech processing applications (e.g., speaker ID, speech recognition, and language ID).
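
As an illustration of the Kullback-Leibler distance mentioned in the abstract, the following is a minimal sketch of how such a distance between two Gaussian models could be computed. It rests on assumptions that are not stated in the abstract: each speaking style is modeled by a single diagonal-covariance Gaussian, the distance is the symmetric sum of the two directed divergences, and the feature matrices are synthetic stand-ins for spectral features. The function names are hypothetical and do not come from the paper.

```python
import numpy as np


def fit_diag_gaussian(features):
    """Estimate mean and per-dimension variance from (n_frames, n_dims) features."""
    features = np.asarray(features, dtype=float)
    mean = features.mean(axis=0)
    var = features.var(axis=0) + 1e-8  # small floor avoids division by zero
    return mean, var


def kl_diag_gaussian(mean_p, var_p, mean_q, var_q):
    """Directed divergence KL(p || q) between diagonal Gaussians, in nats."""
    return 0.5 * np.sum(
        np.log(var_q / var_p) + (var_p + (mean_p - mean_q) ** 2) / var_q - 1.0
    )


def symmetric_kl(mean_p, var_p, mean_q, var_q):
    """Symmetric KL distance: KL(p || q) + KL(q || p) (assumed form)."""
    return kl_diag_gaussian(mean_p, var_p, mean_q, var_q) + kl_diag_gaussian(
        mean_q, var_q, mean_p, var_p
    )


# Synthetic stand-ins for spectral feature frames (not data from the study).
rng = np.random.default_rng(0)
speaking_frames = rng.normal(0.0, 1.0, size=(2000, 13))
singing_frames = rng.normal(0.5, 1.5, size=(2000, 13))

speaking_model = fit_diag_gaussian(speaking_frames)
singing_model = fit_diag_gaussian(singing_frames)
distance = symmetric_kl(*speaking_model, *singing_model)
print(f"symmetric KL distance: {distance:.3f}")
```

A larger value indicates that the two acoustic spaces overlap less. The study's actual models and feature set may differ from this simplified form.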


Bibliographic details

Source: The Journal of the Acoustical Society of America, 2020, Issue 2, 16 pages
Authors: Hansen John H. L.; Bokshi Marigona; Khorram Soheil
Affiliation (all three authors): Univ Texas Dallas, Ctr Robust Speech Syst (CRSS), Robust Speech Technol Lab (RSTL), Richardson, TX 75080, USA
Original format: PDF
Language: English
Chinese Library Classification: Acoustics

