10th Western Pacific Acoustics Conference

Spoken Term Detection by Result Integration of Plural Subwords using Confidence Measure


Abstract

Spoken term detection (STD) systems should not restrict query terms, because queries such as technical terms or personal names are likely to be out-of-vocabulary (OOV) for a general speech recognizer. To detect OOV terms, we have proposed a method that uses subword units such as monophones and triphones, together with newly proposed subword units such as 1/2phone and 1/3phone. The method made it possible to detect OOV terms, and the proposed subword units performed well compared with triphones. STD performance can be improved further by integrating the detection results obtained from several subword units. Each utterance in the spoken documents receives one score per subword unit, and the integrated score is computed by weighting the score of each subword unit and summing the weighted scores. When determining the weights, we pay attention to a confidence measure obtained from a subword-based speech recognizer. When the scores obtained from several subword units are integrated for a given section, a section with a high confidence measure for the query is more reliable than sections with lower confidence measures. STD performance is therefore improved by giving a higher weight to the subword unit with a high confidence measure and lower weights to the other subword units. The proposed method determines the weights automatically and dynamically for each candidate section, according to the confidence measures of the subword models in that section. Evaluation experiments on a lecture corpus confirmed that the proposed method improves on the performance of a conventional linear integration method.
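The integration step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give the weighting formula, so the rule used here (weights proportional to each subword unit's confidence measure) is an assumption, as are the function names, the unit labels, and the example values. The sketch also assumes that scores are normalized to a comparable range and that a higher score means a better match.

```python
from typing import Dict


def linear_integration(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Baseline: weighted sum with one fixed weight per subword unit."""
    return sum(weights[unit] * scores[unit] for unit in scores)


def confidence_weights(confidences: Dict[str, float]) -> Dict[str, float]:
    """Hypothetical rule: weights proportional to confidence, summing to 1."""
    total = sum(confidences.values())
    if total <= 0:
        # No usable confidence information: fall back to uniform weights.
        n = len(confidences)
        return {unit: 1.0 / n for unit in confidences}
    return {unit: c / total for unit, c in confidences.items()}


def confidence_integration(scores: Dict[str, float],
                           confidences: Dict[str, float]) -> float:
    """Weighted sum whose weights are recomputed for each candidate section."""
    weights = confidence_weights(confidences)
    return linear_integration(scores, weights)


# Illustrative values for one candidate section (not taken from the paper).
scores = {"monophone": 0.60, "triphone": 0.80, "half_phone": 0.40}
confidences = {"monophone": 0.2, "triphone": 0.6, "half_phone": 0.2}

integrated = confidence_integration(scores, confidences)
print(f"integrated score: {integrated:.2f}")
```

Because the weights are derived from the confidences of the candidate section itself, two sections scored by the same subword units can end up with different weightings, whereas `linear_integration` with fixed weights treats every section the same way.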

Bibliographic details

  • Source: 10th Western Pacific Acoustics Conference
  • Conference location: Beijing (CN)
  • Author affiliations

    Graduate School of Software and Information Science, Iwate Prefectural University, Iwate, Japan;

    Graduate School of Software and Information Science, Iwate Prefectural University, Iwate, Japan;

    Graduate School of Software and Information Science, Iwate Prefectural University, Iwate, Japan;

    Graduate School of Software and Information Science, Iwate Prefectural University, Iwate, Japan;

    University of Tsukuba, Japan;

    AIST, Japan;

  • Original format: PDF
  • Language: English
  • CLC classification: Acoustics
  • Indexed: 2022-08-26 14:23:07
