Phonetic Segmentation Using Knowledge from Visual and Perceptual Domain

Abstract

Accurate, automatic phonetic segmentation is crucial for several speech-based applications, such as phone-level articulation analysis and error detection, speech synthesis, annotation, speech recognition and emotion recognition. In this paper we examine the effectiveness of visual features, obtained by processing the spectrogram image of a speech utterance, for phonetic segmentation. Further, we propose a mechanism to combine knowledge from the visual and perceptual domains for automatic phonetic segmentation. This process can be considered analogous to manual phonetic segmentation. The technique was evaluated on the TIMIT American English corpus. Experimental results show significant improvements in phonetic segmentation, especially at the lower tolerances of 5, 10 and 15 ms; an absolute improvement of 8.29% is observed on TIMIT at a 10 ms tolerance.
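
The abstract reports results at tolerances of 5, 10 and 15 ms but does not define the scoring. A common convention in segmentation work, assumed here and not confirmed by this record, is to count a reference boundary as correctly detected when some detected boundary lies within the tolerance of it. The sketch below illustrates that convention only; the function name `boundary_hit_rate` and the sample boundary times are hypothetical and are not taken from the paper.

```python
from bisect import bisect_left


def boundary_hit_rate(reference_ms, detected_ms, tolerance_ms):
    """Fraction of reference boundaries that have a detected boundary
    within +/- tolerance_ms. All times are integers in milliseconds.

    Simplifying assumption: one detected boundary may match more than
    one reference boundary (no one-to-one assignment is enforced).
    """
    if not reference_ms:
        return 0.0
    detected = sorted(detected_ms)
    hits = 0
    for ref in reference_ms:
        i = bisect_left(detected, ref)
        # The nearest detected boundary sits at index i - 1 or i.
        nearest = detected[max(i - 1, 0):i + 1]
        if any(abs(d - ref) <= tolerance_ms for d in nearest):
            hits += 1
    return hits / len(reference_ms)


# Hypothetical boundary times (ms), for illustration only.
reference = [100, 250, 400, 620]
detected = [104, 262, 398, 700]

for tol in (5, 10, 15):
    rate = boundary_hit_rate(reference, detected, tol)
    print(f"{tol} ms tolerance: {rate:.2%}")
```

Under this convention, an "absolute improvement" would be the difference between two such percentages at the same tolerance, not a relative change.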

Bibliographic details

Source
International Conference on Text, Speech and Dialogue | 2017 | 520p | 9 pages
Authors
Bhavik Vachhani; Chitralekha Bhat; Sunil Kopparapu
Original format: PDF
Chinese Library Classification: TP391.1-53
Keywords
Unsupervised phonetic segmentation; Edge detection; Multi-taper; Visual phonetic segmentation

Similar literature

1. Perceptual Doping: An Audiovisual Facilitation Effect on Auditory Speech Processing, From Phonetic Feature Extraction to Sentence Identification in Noise [J]. Shahram Moradi, Björn Lidestam, Elaine Hoi Ning Ng. Ear and Hearing, 2019, No. 2.
2. Effects of Perceptual Uncertainty on Arousal and Preference Across Different Visual Domains [J]. Thomas Z. Ramsoy, Morten Friis-Olivarius. Journal of Neuroscience, Psychology, and Economics, 2012, No. 4.
3. Phonetic Segmentation Using Knowledge from Visual and Perceptual Domain [C]. Bhavik Vachhani, Chitralekha Bhat, Sunil Kopparapu. International Conference on Text, Speech and Dialogue, 2017.
4. Examining a knowledge domain: Interactive visualization of the Geographic Information Science and Technology Body of Knowledge 1. [D]. Stowell, Marilyn Ruth. 2014.
5. Perceptual Doping: An Audiovisual Facilitation Effect on Auditory Speech Processing From Phonetic Feature Extraction to Sentence Identification in Noise [O]. Shahram Moradi, Björn Lidestam, Elaine Hoi Ning Ng.
6. Perceptual relevance of long-domain phonetic dependencies [O]. Nguyen Noël, Fagyal Zsuzsanna, Cole Jennifer. 2004.
7. Indexing Flowers by Color Names using Domain Knowledge-driven Segmentation [R]. Das, M., Manmatha, R., Riseman, E. M. 1998.
