
MALCOM X: Combining maximum likelihood continuity mapping with Gaussian mixture models


Abstract

Gaussian mixture models (GMMs) are among the best speaker recognition algorithms currently available. However, a GMM's estimate of the probability of a speech signal does not change if the temporal order of the feature vectors is randomly shuffled, even though the actual probability of observing the shuffled signal would be dramatically different, probably near zero. A potential way to improve the performance of GMMs is to incorporate temporal information into the estimate of the probability of the data. Doing so could improve speech recognition and speaker recognition, and could potentially aid in detecting lies (abnormalities) in speech data. As described in other documents (Hogden, 1996), MALCOM is an algorithm that can be used to estimate the probability of a sequence of categorical data. MALCOM can also be applied to speech (and other real-valued sequences) if windows of the speech are first categorized using a technique such as vector quantization (Gray, 1984). However, because it quantizes the windows of speech, MALCOM ignores information about within-category differences between speech windows. Thus, MALCOM and GMMs complement each other: MALCOM is good at using sequence information, whereas GMMs capture within-category differences better than the vector quantization typically used by MALCOM. An extension of MALCOM (MALCOM X) that can be used for estimating the probability of a speech sequence is described here.
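The shuffling argument in the abstract can be checked numerically. The following is a minimal sketch, not the report's method: it assumes a diagonal-covariance GMM that scores frames as independent draws, and it uses a synthetic smooth trajectory as a stand-in for speech feature vectors. The function name `gmm_log_likelihood`, the toy data, and all parameter values are illustrative assumptions.

```python
import numpy as np

def gmm_log_likelihood(frames, weights, means, variances):
    """Total log-likelihood of frames under a diagonal-covariance GMM.

    Frames are scored as independent draws, so the result is a sum of
    per-frame terms. Shapes: frames (T, D); means, variances (K, D);
    weights (K,).
    """
    diff = frames[:, None, :] - means[None, :, :]                     # (T, K, D)
    log_norm = -0.5 * np.sum(np.log(2.0 * np.pi * variances), axis=1)  # (K,)
    log_comp = log_norm[None, :] - 0.5 * np.sum(
        diff ** 2 / variances[None, :, :], axis=2
    )                                                                 # (T, K)
    log_weighted = np.log(weights)[None, :] + log_comp
    # Numerically stable log-sum-exp over the K components.
    peak = log_weighted.max(axis=1, keepdims=True)
    per_frame = peak[:, 0] + np.log(np.exp(log_weighted - peak).sum(axis=1))
    return per_frame.sum()

def mean_squared_step(frames):
    """Average squared distance between consecutive frames."""
    return np.mean(np.sum(np.diff(frames, axis=0) ** 2, axis=1))

rng = np.random.default_rng(0)
T, D, K = 200, 3, 4

# Assumption: a smooth synthetic trajectory standing in for speech features.
t = np.linspace(0.0, 4.0 * np.pi, T)
frames = np.stack([np.sin(t), np.cos(t), np.sin(2.0 * t)], axis=1)
frames = frames + 0.05 * rng.standard_normal((T, D))

# Assumption: arbitrary GMM parameters; no training is performed here.
weights = np.full(K, 1.0 / K)
means = rng.standard_normal((K, D))
variances = np.full((K, D), 0.5)

shuffled = frames[rng.permutation(T)]

ll_original = gmm_log_likelihood(frames, weights, means, variances)
ll_shuffled = gmm_log_likelihood(shuffled, weights, means, variances)
step_original = mean_squared_step(frames)
step_shuffled = mean_squared_step(shuffled)

print("GMM log-likelihood, original order:", ll_original)
print("GMM log-likelihood, shuffled order:", ll_shuffled)
print("mean squared frame-to-frame step, original:", step_original)
print("mean squared frame-to-frame step, shuffled:", step_shuffled)
```

Because the score is a sum of per-frame terms, the two log-likelihoods agree up to floating-point rounding, while the frame-to-frame step statistic shows that the shuffled sequence has lost its temporal continuity. This is the kind of sequence information a model like MALCOM is meant to use.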
