Intelligibility Assessment of the De-Identified Speech Obtained Using Phoneme Recognition and Speech Synthesis Systems

Abstract

The paper presents and evaluates a speaker de-identification technique that combines phoneme recognition with two speech synthesis techniques. The phoneme recognition system is built using HMM-based acoustic models of context-dependent diphone speech units, and two different speech synthesis systems (diphone TD-PSOLA-based and HMM-based) are employed to re-synthesize the recognized sequences of speech units. Since the acoustic models of the two speech synthesis systems are assumed to be completely independent of the input speaker's voice, the highest level of input speaker de-identification is ensured. The proposed de-identification system is language dependent but vocabulary and speaker independent, since it is based mainly on acoustic modelling of the selected diphone speech units. Because the computing methods are relatively simple, the whole de-identification procedure runs in real time. The speech outputs are compared and assessed by testing the intelligibility of the re-synthesized speech from different points of view. The assessment results show interesting variability in the evaluators' transcriptions, depending on the input speaker, the synthesis method applied and the evaluators' capabilities. Despite the relatively high phoneme recognition error rate (approx. 19%), the re-synthesized speech is in many cases still fully intelligible.
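The abstract quotes a phoneme recognition error rate of roughly 19% but does not say how it was computed. A common convention, assumed here and not confirmed by the source, is the edit distance between the reference and recognized phoneme sequences divided by the reference length. The sketch below illustrates that convention; the function names and the toy phoneme sequences are hypothetical and are not taken from the paper.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences of phoneme symbols."""
    prev = list(range(len(hyp) + 1))  # distances from an empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion of a reference symbol
                            curr[j - 1] + 1,      # insertion of a hypothesis symbol
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def phoneme_error_rate(ref, hyp):
    """Edit distance normalized by reference length (assumed definition)."""
    if not ref:
        raise ValueError("reference sequence must not be empty")
    return edit_distance(ref, hyp) / len(ref)


# Toy sequences, invented for illustration only.
reference = ["d", "o", "b", "e", "r", "d", "a", "n"]
recognized = ["d", "o", "p", "e", "r", "d", "a"]

per = phoneme_error_rate(reference, recognized)
print(f"phoneme error rate: {per:.2%}")
```

In this toy case one substitution and one deletion over eight reference phonemes give an error rate of 25%. Under this definition, an error rate near 19% means roughly one edit per five reference phonemes, which is the level at which the abstract still reports many fully intelligible outputs.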

Bibliographic details

Source
International Conference on Text, Speech and Dialogue | 2014 | 8 pages
Authors
Tadej Justin; France Mihelic; Simon Dobrisek
Original format: PDF
Chinese Library Classification: TP391.1-53
Keywords
Voice de-identification; Phoneme recognition; Speech synthesis; Diphone speech units; HMM modelling; Intelligibility evaluation

