Journal: IEEE Transactions on Audio, Speech, and Language Processing

Towards Link Characterization From Content: Recovering Distributions From Classifier Output

Abstract

In processing large volumes of speech and language data, we are often interested in the distribution of languages, speakers, topics, etc. For large data sets, these distributions are typically estimated at a given point in time using pattern classification technology. It is well known that such estimates can be highly biased, especially for rare classes. While these biases have been addressed in some applications, they have thus far been ignored in the speech and language literature. This neglect causes significant error for low-frequency classes. Correcting this biased distribution involves exploiting uncertain knowledge of the classifier error patterns. We describe a numerical method, the Metropolis-Hastings (M-H) algorithm, which provides a Bayes estimator for the distribution. We experimentally evaluate this algorithm for a speaker recognition task, demonstrating a fivefold reduction in root mean squared error.
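
The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it makes simplifying assumptions that the abstract does not state: the confusion matrix is treated as known and fixed (the paper treats the error patterns as uncertain), the prior over the class distribution is uniform on the simplex, and the proposal is a symmetric random walk, so the Metropolis-Hastings acceptance ratio reduces to a likelihood ratio. The names `log_likelihood` and `estimate_distribution` and all numbers are hypothetical. The posterior mean returned here is the Bayes estimator under squared-error loss.

```python
import math
import random


def log_likelihood(p, confusion, counts):
    """Multinomial log-likelihood (up to a constant) of the observed label counts."""
    k = len(p)
    total = 0.0
    for j in range(k):
        # Probability that the classifier outputs label j when the true mix is p.
        q_j = sum(p[i] * confusion[i][j] for i in range(k))
        if counts[j] > 0:
            if q_j <= 0.0:
                return -math.inf
            total += counts[j] * math.log(q_j)
    return total


def estimate_distribution(counts, confusion, n_samples=20000, burn_in=5000,
                          step=0.02, seed=0):
    """Posterior-mean estimate of the true class distribution via a random walk."""
    rng = random.Random(seed)
    k = len(counts)
    p = [1.0 / k] * k                      # start at the uniform distribution
    current = log_likelihood(p, confusion, counts)
    sums = [0.0] * k
    for t in range(burn_in + n_samples):
        # Symmetric proposal: shift a small mass delta between two classes.
        i, j = rng.sample(range(k), 2)
        delta = rng.uniform(-step, step)
        new_i, new_j = p[i] + delta, p[j] - delta
        if new_i >= 0.0 and new_j >= 0.0:  # otherwise outside the simplex: reject
            proposal = list(p)
            proposal[i], proposal[j] = new_i, new_j
            candidate = log_likelihood(proposal, confusion, counts)
            # Uniform prior + symmetric proposal: accept with the likelihood ratio.
            if rng.random() < math.exp(min(0.0, candidate - current)):
                p, current = proposal, candidate
        if t >= burn_in:
            for c in range(k):
                sums[c] += p[c]
    return [s / n_samples for s in sums]


# Hypothetical example: confusion[i][j] = P(classifier says j | true class i).
confusion = [[0.90, 0.05, 0.05],
             [0.05, 0.90, 0.05],
             [0.05, 0.05, 0.90]]
# Counts of classifier outputs, constructed from a true mix of (0.90, 0.08, 0.02).
counts = [8150, 1180, 670]
naive = [c / sum(counts) for c in counts]   # [0.815, 0.118, 0.067]
estimate = estimate_distribution(counts, confusion)
print("naive:", naive)
print("corrected:", [round(x, 3) for x in estimate])
```

In this made-up example the naive estimate assigns the rare third class 6.7% of the data, although the counts were built from a true share of 2%; the sampler corrects for the classifier's error pattern. Handling an uncertain confusion matrix, as the abstract describes, would require sampling its rows as well, which this sketch omits.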