Venue: Annual conference of the International Speech Communication Association
Confidence measure for speech indexing based on Latent Dirichlet Allocation

Abstract

This paper presents a confidence measure for speech indexing that aims to predict the indexing quality of a speech document for a Spoken Document Retrieval (SDR) task. We first introduce how the indexing quality of a speech document is evaluated. Then, we present our method for predicting that quality. It is based on the confidence measure provided by an automatic speech recognition system and on the detection of semantic outliers, implemented with the Latent Dirichlet Allocation (LDA) model. Experiments are conducted on the French broadcast news campaign ESTER2 in a classical SDR scenario where users submit text queries to a search engine. Results demonstrate an overall improvement when the detection is done with the LDA model. The detection rate is always above 70%.
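
The abstract does not spell out how the two signals are combined, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes each recognized word comes with an ASR confidence score and that a trained LDA model supplies p(word | topic); the toy table `WORD_GIVEN_TOPIC`, the averaging used to estimate the document's topic mixture, the threshold, and all function names are hypothetical choices made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical toy p(word | topic) table for two topics.
# A real system would read these values from a trained LDA model.
WORD_GIVEN_TOPIC: Dict[str, List[float]] = {
    "election": [0.30, 0.01],
    "minister": [0.30, 0.01],
    "vote": [0.30, 0.01],
    "goal": [0.01, 0.40],
}


def topic_posterior(word: str) -> List[float]:
    """p(topic | word) under a uniform topic prior (an assumption)."""
    probs = WORD_GIVEN_TOPIC[word]
    total = sum(probs)
    return [p / total for p in probs]


def document_topics(words: List[str]) -> List[float]:
    """Crude topic mixture: the average of per-word topic posteriors.

    This is a one-pass approximation, not full LDA inference.
    """
    posteriors = [topic_posterior(w) for w in words]
    n_topics = len(posteriors[0])
    return [sum(p[k] for p in posteriors) / len(posteriors) for k in range(n_topics)]


def semantic_fit(word: str, theta: List[float]) -> float:
    """Agreement between a word's topic posterior and the document mixture."""
    return sum(t * p for t, p in zip(theta, topic_posterior(word)))


def indexing_confidence(
    hypothesis: List[Tuple[str, float]], fit_threshold: float = 0.5
) -> Tuple[float, List[str]]:
    """Return a document-level score and the words flagged as semantic outliers.

    `hypothesis` is a list of (word, ASR confidence) pairs. Flagged words
    contribute zero to the score; the others contribute their ASR confidence.
    """
    words = [w for w, _ in hypothesis]
    theta = document_topics(words)
    outliers = [w for w in words if semantic_fit(w, theta) < fit_threshold]
    kept = [conf for w, conf in hypothesis if w not in outliers]
    return sum(kept) / len(hypothesis), outliers


if __name__ == "__main__":
    hyp = [("election", 0.9), ("minister", 0.8), ("vote", 0.7), ("goal", 0.6)]
    score, flagged = indexing_confidence(hyp)
    print(score, flagged)
```

In this toy transcript, "goal" sits in a different topic from the other three words, so it is the one flagged as a semantic outlier even though its ASR confidence is not especially low. That is the kind of error ASR confidence alone could miss.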