Intelligent Speech Features Mining for Robust Synthesis System Evaluation

Abstract

Speech synthesis evaluation involves the analytical description of features sufficient to assess the performance of a speech synthesis system. Its primary aim is to determine how closely a synthetic voice resembles a natural human voice. Evaluation is usually carried out with two kinds of methods, subjective and objective, which have become the standard for assessing voice quality but are challenged by high speech variability and by errors in human judgment. Machine learning (ML) techniques have proven successful in determining and enhancing speech quality. Hence, this contribution uses both supervised and unsupervised ML tools to recognize and classify speech quality classes. Data were collected from a listening test (experiment) in which domain experts assessed speech quality for naturalness, intelligibility, and comprehensibility, as well as tone, vowel, and consonant correctness. During the pre-processing stage, Principal Component Analysis (PCA) identified 4 principal components (intelligibility, naturalness, comprehensibility, and tone), accounting for 76.79% of the variability in the dataset. An unsupervised visualization using a self-organizing map (SOM) then discovered five distinct target clusters with high densities of instances and showed modest correlation between significant input factors. Pattern recognition using a deep neural network (DNN) produced a confusion matrix with an overall accuracy of 93.1%, signifying an excellent classification system.
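
The two summary figures quoted in the abstract (the share of variability covered by the retained principal components, and the overall accuracy read from a confusion matrix) come from simple ratios. The sketch below shows that arithmetic only. It is not the authors' code: the function names and the toy numbers are hypothetical and were chosen purely for illustration, not taken from the study.

```python
from typing import List, Sequence


def cumulative_explained_variance(eigenvalues: Sequence[float]) -> List[float]:
    """Cumulative share of total variance, components sorted largest first."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    shares, running = [], 0.0
    for value in ordered:
        running += value
        shares.append(running / total)
    return shares


def overall_accuracy(confusion: Sequence[Sequence[int]]) -> float:
    """Correct predictions (the diagonal) divided by all predictions."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total


# Hypothetical numbers used only to illustrate the arithmetic;
# they are NOT the study's eigenvalues or its confusion matrix.
toy_eigenvalues = [3.0, 2.0, 1.0, 1.0, 0.5, 0.5]
toy_confusion = [
    [8, 1, 1],
    [0, 9, 1],
    [1, 0, 9],
]

if __name__ == "__main__":
    # Share of variance covered by the first k components, k = 1..6.
    print(cumulative_explained_variance(toy_eigenvalues))
    # Fraction of instances on the diagonal of the confusion matrix.
    print(overall_accuracy(toy_confusion))
```

In this reading, the "76.79%" figure corresponds to the fourth entry of the cumulative list for the study's own data, and "93.1%" to the diagonal share of its confusion matrix.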

Bibliographic details

Source
Language and Technology Conference | 2018 | p. 446 | 16 pages
Authors
Moses E. Ekpenyong; Udoinyang G. Inyang; Victor E. Ekong
Original format: PDF
Chinese Library Classification: TP18-53
Keywords
Deep neural network; Dimension reduction; Machine learning; Pattern recognition; Speech quality evaluation
