IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2009)

Voiced/unvoiced pattern-based duration modeling for language identification

Abstract

Most existing duration modeling approaches rely on a phone recognizer and require a manually annotated corpus to train the segmentation models, which is usually costly and time-consuming. In this paper, a novel duration modeling approach is proposed that requires neither a phone recognizer nor annotated training data and enables fast computation for language identification. In this approach, segmentation is performed using articulatory features such as voicing status. A pair of consecutive unvoiced and voiced segments is taken as the basic unit; the duration of each segment is normalized within each utterance and then quantized into 20 discrete ranges. The quantized units are treated as symbol sequences and modeled with n-gram models to capture temporal patterns, which are hypothesized to differ across languages. Experiments on the NIST LRE 2005 tasks show a relative 19.7% EER improvement when the proposed duration modeling-based system is added to a fusion system containing two GMM-UBM based acoustic systems using MFCC and pitch+intensity features.
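
As a rough illustration of the front end the abstract describes, the Python sketch below collapses a frame-level voicing decision into unvoiced+voiced units, normalizes segment durations within the utterance, and quantizes them into 20 discrete ranges to form a symbol sequence. The function names, the max-based normalization, the symbol labels, and the toy input are illustrative assumptions; the paper's exact normalization and quantization scheme is not specified here.

# Minimal sketch of the voiced/unvoiced duration front end (assumed details).
from itertools import groupby


def uv_segments(voicing):
    """Collapse a per-frame voicing decision (0 = unvoiced, 1 = voiced)
    into (label, duration_in_frames) runs."""
    return [(label, sum(1 for _ in run)) for label, run in groupby(voicing)]


def uv_units(segments):
    """Pair each unvoiced segment with the voiced segment that follows it,
    forming the unvoiced+voiced unit used in the abstract."""
    units = []
    for (lab_a, dur_a), (lab_b, dur_b) in zip(segments, segments[1:]):
        if lab_a == 0 and lab_b == 1:
            units.append((dur_a, dur_b))
    return units


def quantize_units(units, n_bins=20):
    """Normalize durations within the utterance (here: by the maximum
    duration, an assumption) and quantize each into n_bins ranges."""
    durations = [d for pair in units for d in pair]
    max_dur = max(durations) if durations else 1

    def q(d):
        # Map a duration to a bin index in [0, n_bins - 1].
        return min(int(n_bins * d / max_dur), n_bins - 1)

    return [f"U{q(u)}V{q(v)}" for u, v in units]


if __name__ == "__main__":
    # Toy frame-level voicing decisions for one utterance.
    voicing = [0, 0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 0, 1, 1, 1]
    symbols = quantize_units(uv_units(uv_segments(voicing)))
    print(symbols)  # symbol sequence to be scored by per-language n-gram models

The resulting symbol sequence would then be modeled with per-language n-gram models and, as in the reported experiments, fused with GMM-UBM acoustic systems.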
