Semi-automatic acoustic model generation from large unsynchronized audio and text chunks

Abstract

This paper presents an effective technique for training an acoustic model from large, unsynchronized audio and text chunks. Given such a speech corpus, it defines an algorithm that automatically segments each chunk into smaller fragments and synchronizes them with the corresponding text. These smaller fragments are better suited to the standard model training algorithms used for automatic speech recognition systems. The proposed approach is particularly suitable for bootstrapping language models without relying on specialized training material or borrowing from models trained for other, similar languages. Extensive experiments with the CMU Sphinx 4 recognizer and the SphinxTrain model generator, in a setting designed for large-vocabulary continuous speech recognition, show the effectiveness of the approach.
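
The abstract does not say how the segmentation and synchronization are carried out. As a rough illustration only, the sketch below uses one common idea for aligning long recordings with their transcripts: run a recognizer over the audio, match its time-stamped output against the reference text, and treat long runs of agreeing words as anchors at which the chunk can be cut. This is an assumption for illustration, not the authors' algorithm; the function names `find_anchors` and `cut_fragments`, the `min_run` threshold, and the `(word, start, end)` hypothesis format are all hypothetical.

```python
from difflib import SequenceMatcher


def find_anchors(ref_words, hyp, min_run=3):
    """Find runs of at least min_run consecutive words shared by both sequences.

    ref_words: reference transcript as a list of words (no timing).
    hyp: recognizer output as a list of (word, start_sec, end_sec) tuples
         (hypothetical format, assumed for this sketch).
    Returns a list of (ref_start, ref_end, t_start, t_end) tuples.
    """
    hyp_words = [word for word, _, _ in hyp]
    matcher = SequenceMatcher(a=ref_words, b=hyp_words, autojunk=False)
    anchors = []
    for block in matcher.get_matching_blocks():
        if block.size >= min_run:
            t_start = hyp[block.b][1]
            t_end = hyp[block.b + block.size - 1][2]
            anchors.append((block.a, block.a + block.size, t_start, t_end))
    return anchors


def cut_fragments(ref_words, anchors):
    """Cut the reference text and the audio timeline at the start of each anchor.

    Each fragment runs from the start of one anchor to the start of the next;
    the last fragment ends where the last anchor ends. Words before the first
    anchor are left out because their timing is unknown.
    """
    fragments = []
    for i, (a0, a1, t0, t1) in enumerate(anchors):
        if i + 1 < len(anchors):
            next_a0, _, next_t0, _ = anchors[i + 1]
            fragments.append((" ".join(ref_words[a0:next_a0]), t0, next_t0))
        else:
            fragments.append((" ".join(ref_words[a0:a1]), t0, t1))
    return fragments
```

Each resulting fragment pairs a short stretch of text with a start and end time, which is the kind of input a standard trainer expects. Requiring several consecutive matching words before accepting an anchor is a design choice that trades coverage for robustness against recognition errors.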

Bibliographic details

Source: Annual conference of the International Speech Communication Association; INTERSPEECH 2011, 2011, pp. 1692-1695 (4 pages)
Authors: Michele Alessandrini; Giorgio Biagetti; Alessandro Curzi; Claudio Turchetti
Original format: PDF
Chinese Library Classification: Communications
Keywords: speech recognition; acoustic model; model bootstrapping; automatic segmentation
Date indexed: 2022-08-26 15:06:02

Similar literature

1. A Rule-Based Model and Genetic Algorithm Combination for Persian Text Chunking [J]. Samira Noferesti, Mehrnoush Shamsfard. International journal of computers and their applications, 2014, No. 2.
2. Chunk Parsing and Entity Relation Extracting to Chinese Text by Using Conditional Random Fields Model [J]. Junhua Wu, Longxia Liu. Journal of Intelligent Learning Systems and Applications, 2010, No. 3.
3. Training Deep Bidirectional LSTM Acoustic Model for LVCSR by a Context-Sensitive-Chunk BPTT Approach [J]. Kai Chen, Qiang Huo. Audio, Speech, and Language Processing, IEEE/ACM Transactions on, 2016, No. 7.
4. Gut, Besser, Chunker - Selecting the Best Models for Text Chunking with Voting [C]. Balazs Indig, Istvan Endredy. International conference on intelligent text processing and computational linguistics, 2018.
5. A system for acoustic chord transcription and key extraction from audio using hidden Markov models trained on synthesized audio [D]. Lee, Kyogu. 2008.
6. Semi-automatic scene generation using the Digital Anatomist Foundational Model [O]. B. A. Wong, C. Rosse, J. F. Brinkley. 1999.
7. Computer-Aided Semi-Automatic Generation Method of Animation Image and Text Split Mirror [O]. Xiaoyu Liu, Deng Pan. 2021.
