Frontiers in Psychology

Visual speech discrimination and identification of natural and synthetic consonant stimuli



Abstract

From phonetic features to connected discourse, every level of psycholinguistic structure, including prosody, can be perceived through viewing the talking face. Yet a longstanding notion in the literature is that visual speech perceptual categories comprise groups of phonemes (referred to as visemes), such as /p, b, m/ and /f, v/, whose internal structure is not informative to the visual speech perceiver. To our knowledge, this conclusion has not been evaluated using a psychophysical discrimination paradigm. We hypothesized that perceivers can discriminate the phonemes within typical viseme groups, and that discrimination measured with d-prime (d') and response latency is related to visual stimulus dissimilarities between consonant segments. In Experiment 1, participants performed speeded discrimination for pairs of consonant-vowel spoken nonsense syllables that were predicted to be same, near, or far in their perceptual distances, and that were presented as natural or synthesized video. Near pairs were within-viseme consonants. Natural within-viseme stimulus pairs were discriminated significantly above chance (except for /k/-/h/). Sensitivity (d') increased and response times decreased with distance. Discrimination and identification were superior with natural stimuli, which comprised more phonetic information. We suggest that the notion of the viseme as a unitary perceptual category is incorrect. Experiment 2 probed the perceptual basis for visual speech discrimination by inverting the stimuli. An overall reduction in d' with inverted stimuli, together with a persistent pattern of larger d' for far than for near stimulus pairs, is interpreted as evidence that visual speech is represented by both its motion and its configural attributes. The methods and results of this investigation open up avenues for understanding the neural and perceptual bases of visual and audiovisual speech perception and for the development of practical applications such as visible speech synthesis for lipreading/speechreading.
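The abstract reports sensitivity as d-prime (d'), the standard signal-detection measure. Below is a minimal sketch of how d' can be computed from trial counts, assuming the basic yes/no (independent-observation) formula d' = z(hit rate) - z(false-alarm rate); same-different paradigms like the one described here often apply a differencing-model correction instead, which is not reproduced in this sketch. The function name and the log-linear rate correction are illustrative choices, not taken from the paper.

```python
# A hedged sketch of d' (d-prime) computation, assuming the simple
# independent-observation formula; the study's same-different design
# may have used a different (e.g., differencing-model) computation.
from scipy.stats import norm

def d_prime(hits, misses, false_alarms, correct_rejections):
    """Compute d' from raw trial counts. A log-linear correction
    (add 0.5 to each count) avoids infinite z-scores when an
    observed rate is exactly 0 or 1."""
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    # d' is the distance between the z-transformed hit and false-alarm rates.
    return norm.ppf(hit_rate) - norm.ppf(fa_rate)

# Hypothetical example counts: a "far" pair discriminated well versus a
# "near" (within-viseme) pair discriminated above chance but less well.
print(d_prime(45, 5, 8, 42))    # ~2.2, high sensitivity
print(d_prime(30, 20, 18, 32))  # ~0.6, lower sensitivity
```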
