Journal: IEEE Transactions on Audio, Speech, and Language Processing

A Nonparametric Bayesian Multipitch Analyzer Based on Infinite Latent Harmonic Allocation

Abstract

The statistical multipitch analyzer described in this paper estimates multiple fundamental frequencies (F0s) in polyphonic music audio signals produced by pitched instruments. It is based on hierarchical nonparametric Bayesian models that can deal with the uncertainty of unknown random variables such as model complexities (e.g., the number of F0s and the number of harmonic partials), model parameters (e.g., the values of F0s and the relative weights of harmonic partials), and hyperparameters (i.e., prior knowledge on complexities and parameters). Using these models, we propose a statistical method called infinite latent harmonic allocation (iLHA). To avoid model-complexity control, we allow the observed spectra to contain an unbounded number of sound sources (F0s), each of which is allowed to contain an unbounded number of harmonic partials. More specifically, to model a set of time-sliced spectra, we formulate nested infinite Gaussian mixture models based on hierarchical and generalized Dirichlet processes. To avoid manual tuning of influential hyperparameters, we put noninformative hyperprior distributions on them in a hierarchical manner. For efficient Bayesian inference, we use a modern technique called collapsed variational Bayes. In comparative experiments using audio recordings of piano and guitar solo performances, iLHA yielded promising results, and we found room for improvement through modeling of temporal continuity and spectral smoothness.
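
The nested mixture structure described in the abstract can be illustrated with a small generative sketch. This is not the authors' implementation: it assumes a log-frequency axis in cents, a fixed Gaussian width, and a truncated stick-breaking construction as a stand-in for the Dirichlet-process priors. The function names (`stick_breaking`, `harmonic_spectrum`), the truncation levels, and all numeric values are hypothetical, and inference by collapsed variational Bayes is not shown.

```python
import numpy as np


def stick_breaking(betas):
    """Convert stick-breaking fractions in (0, 1) into mixture weights.

    Weight k is betas[k] times the stick length left after the first k breaks.
    With a finite truncation the weights sum to slightly less than 1.
    """
    betas = np.asarray(betas, dtype=float)
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - betas[:-1])))
    return betas * remaining


def harmonic_spectrum(freqs_cents, f0s_cents, source_w, partial_w, sigma=20.0):
    """Evaluate a nested Gaussian mixture on a log-frequency grid (cents).

    Outer mixture: one component per sound source (F0).
    Inner mixture: one Gaussian per harmonic partial of that source.
    Assumption: the m-th partial lies 1200 * log2(m) cents above its F0.
    """
    freqs_cents = np.asarray(freqs_cents, dtype=float)
    density = np.zeros_like(freqs_cents)
    norm = sigma * np.sqrt(2.0 * np.pi)
    for w_k, f0 in zip(source_w, f0s_cents):
        for m, w_m in enumerate(partial_w, start=1):
            center = f0 + 1200.0 * np.log2(m)
            gauss = np.exp(-0.5 * ((freqs_cents - center) / sigma) ** 2) / norm
            density += w_k * w_m * gauss
    return density


# Hypothetical truncation levels standing in for the unbounded counts.
K, M = 10, 8
rng = np.random.default_rng(0)
source_w = stick_breaking(rng.beta(1.0, 2.0, size=K))   # weights over F0s
partial_w = stick_breaking(rng.beta(1.0, 2.0, size=M))  # weights over partials
f0s = rng.uniform(3600.0, 8400.0, size=K)               # F0 positions in cents
grid = np.arange(0.0, 12000.0, 10.0)
spectrum = harmonic_spectrum(grid, f0s, source_w, partial_w)
```

Because later sticks receive geometrically shrinking weight, most sources and partials contribute almost nothing, which is how an unbounded model can still describe a spectrum with only a few active F0s. For example, `stick_breaking([0.5, 0.5, 0.5])` gives the weights 0.5, 0.25, and 0.125.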