
Anomaly-based annotation error detection in speech-synthesis corpora


Abstract

We investigate the problem of automatically detecting annotation errors in single-speaker read-speech corpora used for speech synthesis. For this purpose, we adopt an anomaly detection framework in which correctly annotated words are treated as normal examples on which the detection methods are trained. Misannotated words are then taken as anomalous examples that do not conform to the normal patterns of the trained detection models. We propose and evaluate several anomaly detection models: Gaussian distribution based detectors, a Grubbs’ test based detector, and a one-class support vector machine based detector. Word-level feature sets, including basic features derived from forced alignment and various acoustic, spectral, phonetic, and positional features, are examined to find an optimal set of features for each anomaly detector. The results, with an F1 score of almost 89%, show that anomaly detection can help detect annotation errors in read-speech corpora for speech synthesis. Furthermore, dimensionality reduction techniques are examined to automatically reduce the number of features used to describe the annotated words. We show that the automatically reduced feature sets achieve results statistically similar to those of the hand-crafted feature sets. We also conducted additional experiments to investigate both the robustness of the proposed anomaly detection framework with respect to the particular data sets used for development and evaluation and the influence of the number of examples needed for anomaly detection. We show that reasonably good detection performance can be reached using significantly fewer examples during the detector development phase. We also propose the concept of a voting detector: a combination of anomaly detectors in which each “single” detector “votes” on whether or not a testing word is annotated correctly, and the final decision is then made by aggregating the votes.
Our results show that the voting detector has the potential to outperform each of the single anomaly detectors. Furthermore, we compare the proposed anomaly detection framework to a classification-based approach (which, unlike anomaly detection, needs anomalous examples during training), and we show that both approaches lead to statistically comparable results when all available anomalous examples are utilized during detector/classifier development. However, when fewer anomalous examples are used, the proposed anomaly detection framework clearly outperforms the classification-based approach. A final listening test showed the effectiveness of the proposed anomaly-based annotation error detection for improving the quality of synthetic speech.
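
The framework described in the abstract (detectors trained only on correctly annotated words, then combined by voting) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the two hypothetical features (word duration and an alignment score), the synthetic training data, the quantile-based threshold, the simple z-score detector standing in for a second detector type, and the majority rule in `vote`. The paper's actual features, detectors, and aggregation scheme may differ.

```python
import math
import random
import statistics


class GaussianDetector:
    """Diagonal-covariance Gaussian fitted on normal (correctly annotated) examples only."""

    def __init__(self, quantile=0.05):
        # Assumption: the threshold is a low quantile of the training log-densities.
        self.quantile = quantile

    def fit(self, normal_rows):
        columns = list(zip(*normal_rows))
        self.means = [statistics.fmean(c) for c in columns]
        # A small floor keeps every variance strictly positive.
        self.variances = [max(statistics.pvariance(c), 1e-9) for c in columns]
        scores = sorted(self.log_density(r) for r in normal_rows)
        self.threshold = scores[int(self.quantile * (len(scores) - 1))]
        return self

    def log_density(self, row):
        return sum(
            -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
            for x, m, v in zip(row, self.means, self.variances)
        )

    def is_anomalous(self, row):
        return self.log_density(row) < self.threshold


class MaxZScoreDetector:
    """Flags a word if any single feature lies more than max_z standard deviations from its mean."""

    def __init__(self, max_z=3.0):
        self.max_z = max_z

    def fit(self, normal_rows):
        columns = list(zip(*normal_rows))
        self.means = [statistics.fmean(c) for c in columns]
        self.stdevs = [max(statistics.pstdev(c), 1e-9) for c in columns]
        return self

    def is_anomalous(self, row):
        return any(
            abs(x - m) / s > self.max_z
            for x, m, s in zip(row, self.means, self.stdevs)
        )


def vote(detectors, row):
    """Majority vote: anomalous if more than half of the detectors say so."""
    votes = sum(d.is_anomalous(row) for d in detectors)
    return 2 * votes > len(detectors)


# Synthetic "normal" words: (duration in seconds, alignment score); both features are hypothetical.
rng = random.Random(0)
normal_words = [[rng.gauss(0.4, 0.1), rng.gauss(-5.0, 1.0)] for _ in range(200)]

detectors = [
    GaussianDetector(quantile=0.05).fit(normal_words),
    MaxZScoreDetector(max_z=3.0).fit(normal_words),
    MaxZScoreDetector(max_z=2.0).fit(normal_words),
]

print(vote(detectors, [5.0, -40.0]))  # a word far from the normal pattern
print(vote(detectors, [0.4, -5.0]))   # a word at the centre of the normal pattern
```

Because no misannotated words are needed to fit these detectors, the sketch mirrors the contrast the abstract draws with classification-based approaches, which require anomalous examples during training.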

Bibliographic record

  • Source
    Computer Speech and Language | 2017, Issue 11 | pp. 1-35 | 35 pages
  • Authors

    Matousek J.; Tihelka Daniel;

  • Affiliations

    Department of Cybernetics, New Technology for the Information Society (NTIS), Faculty of Applied Sciences, University of West Bohemia, Univerzitní 8, Pilsen, Czech Republic;

    Department of Cybernetics, New Technology for the Information Society (NTIS), Faculty of Applied Sciences, University of West Bohemia, Univerzitní 8, Pilsen, Czech Republic;

  • Indexed in: Science Citation Index (SCI), USA; Engineering Index (EI), USA
  • Original format: PDF
  • Language: English
  • Chinese Library Classification
  • Keywords

    Annotation error detection; Anomaly detection; Read speech corpora; Speech synthesis;

