Nonstationary time series modeling with applications to speech signal processing.

Abstract

We develop statistical methods for the analysis of nonstationary time series and apply them to a variety of problems arising in speech signal processing. Information-carrying natural sound signals such as speech exhibit a degree of controlled nonstationarity in that their statistical properties vary slowly over time. Faithfully modeling these temporal variations is extremely valuable for a wide range of applications and can be accomplished by relying on well-understood acoustic models of speech production, which motivate many of the methods developed in this thesis.

First, we make a number of contributions to the classical problem of formant tracking, in which vocal tract resonances are estimated under the assumption of their invariance on the 15-30 ms scale. Next, we relax this piecewise-stationarity constraint and model the temporal dynamics of the vocal tract using time-varying autoregressive (TVAR) models. We develop their algebraic and geometric properties, introduce several new estimators, and use TVAR models to develop a hypothesis test to detect the presence of vocal tract variation in speech waveform data. We study its asymptotic properties, and illustrate its practical efficacy by detecting vocal tract changes across different timescales of speech dynamics.

Next, we explore how standard fixed-resolution short-time Fourier representations may be generalized in order to adapt to the time-frequency structure of a speech signal. To this end, we introduce a family of adaptive, linear time-frequency representations termed superposition frames and show that they are invertible, numerically-stable, and admit fast overlap-add reconstruction akin to standard short-time Fourier techniques. The general construction proceeds via a local signal-adaptive modification of a Gabor frame. Two signal-dependent schemes for selecting an appropriate superposition frame for signal analysis are given, and the framework is illustrated in the context of speech enhancement.

Finally, we introduce a joint model of the vocal tract and the source waveform in order to take into account its quasi-periodic temporal variations during voicing. We incorporate an estimate of the source waveform into the traditional linear prediction framework via nonparametric wavelet regression; the resultant semi-parametric model is applied to various speech analysis problems including formant and source-harmonics-to-noise ratio estimation, inverse filtering, and voicing detection.
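
For readers unfamiliar with the time-varying autoregressive (TVAR) models mentioned in the abstract, the following is a minimal sketch of one common textbook formulation, not the estimators developed in the thesis. It assumes each AR coefficient is a linear combination of a few smooth basis functions and fits the expansion weights by ordinary least squares. The function names (`tvar_basis`, `fit_tvar`, `simulate_tvar2`), the cosine basis, and the synthetic signal are all illustrative assumptions.

```python
import numpy as np


def tvar_basis(n_samples, q):
    """Hypothetical helper: q + 1 cosine basis functions on n_samples points."""
    t = (np.arange(n_samples) + 0.5) / n_samples
    return np.stack([np.cos(np.pi * j * t) for j in range(q + 1)], axis=1)


def fit_tvar(x, p, q):
    """Least-squares fit of a TVAR(p) model with basis-expanded coefficients.

    Model (an assumption of this sketch):
        x[n] = sum_i a_i[n] * x[n - i] + e[n],
        a_i[n] = sum_j alpha[i, j] * f_j[n].
    Returns the coefficient trajectories as an array of shape (N, p).
    """
    x = np.asarray(x, dtype=float)
    n_samples = len(x)
    basis = tvar_basis(n_samples, q)              # shape (N, q + 1)
    rows = np.arange(p, n_samples)                # samples with p past values
    # Column i - 1 holds the lagged sample x[n - i].
    lags = np.stack([x[rows - i] for i in range(1, p + 1)], axis=1)
    # Regressors are the products f_j[n] * x[n - i] for every (i, j) pair.
    design = (lags[:, :, None] * basis[rows][:, None, :]).reshape(len(rows), -1)
    alpha, *_ = np.linalg.lstsq(design, x[rows], rcond=None)
    alpha = alpha.reshape(p, q + 1)
    return basis @ alpha.T                        # a[n, i - 1] = a_i[n]


def simulate_tvar2(n_samples, seed=0):
    """Synthetic TVAR(2) signal whose resonance frequency drifts slowly."""
    rng = np.random.default_rng(seed)
    theta = np.linspace(0.2 * np.pi, 0.4 * np.pi, n_samples)  # pole angle
    radius = 0.95                                              # pole radius
    a1 = 2.0 * radius * np.cos(theta)
    a2 = -(radius ** 2) * np.ones(n_samples)
    noise = rng.standard_normal(n_samples)
    x = np.zeros(n_samples)
    for n in range(2, n_samples):
        x[n] = a1[n] * x[n - 1] + a2[n] * x[n - 2] + noise[n]
    return x, np.stack([a1, a2], axis=1)
```

Calling `fit_tvar(x, p=2, q=3)` on a signal from `simulate_tvar2` returns one coefficient trajectory per lag, which can be compared against the simulated ground truth. A piecewise-stationary analysis would instead refit a fixed AR model on each 15-30 ms frame; the basis expansion lets the coefficients vary smoothly within the analysis interval.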

Bibliographic record

Author: Rudoy, Daniel
Author affiliation: Harvard University
Degree-granting institution: Harvard University
Subject: Applied Mathematics; Statistics; Engineering, Electronics and Electrical
Degree: Ph.D.
Year: 2010
Pages: 349
Original format: PDF
Language: English
