Conference: Asia-Pacific Signal and Information Processing Association Annual Summit and Conference

Re-ranking of spoken term detections using CRF-based triphone detection models

Abstract

Conventional spoken term detection (STD) techniques, which match text against the output of automatic speech recognition (ASR) systems, are not robust to speech recognition errors. This paper proposes a conditional random fields (CRF)-based re-ranking approach that recomputes the detection scores produced by a phoneme-based dynamic time warping (DTW) STD approach. In the re-ranking approach, we treat STD as a sequence labeling problem. We use CRF-based triphone detection models built on features generated from multiple types of phoneme-based transcriptions. The models learn recognition error patterns, such as phoneme-to-phoneme confusions, within the CRF framework. As a result, they can detect each triphone that composes a query term and assign it a detection probability. In an experimental evaluation on the Japanese OOV test collection, the CRF-based approach alone could not outperform the conventional DTW-based approach we had previously proposed; however, it worked well as a re-ranking (second-pass) process for the detections from the DTW-based approach. The CRF-based re-ranking approach yielded a 2.4% improvement in F-measure for STD performance.
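
The two-pass idea described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the DTW score and the CRF triphone detection probabilities are combined, so everything below is an assumption made for illustration: the function names `triphones` and `rerank`, the averaging of per-triphone probabilities, the linear interpolation weight `alpha`, and the convention that the DTW score has already been converted to a similarity in [0, 1] where higher is better.

```python
from typing import List, Sequence, Tuple


def triphones(phonemes: Sequence[str]) -> List[Tuple[str, ...]]:
    """Split a query's phoneme sequence into overlapping triphones."""
    return [tuple(phonemes[i:i + 3]) for i in range(len(phonemes) - 2)]


def rerank(
    detections: Sequence[Tuple[str, float, Sequence[float]]],
    alpha: float = 0.5,
) -> List[Tuple[str, float]]:
    """Re-rank first-pass detections (hypothetical scoring scheme).

    Each detection is (detection_id, dtw_similarity, triphone_probs), where
    triphone_probs holds one detection probability per query triphone.
    The combined score is an assumed linear interpolation, not the
    paper's formula.
    """
    rescored = []
    for det_id, dtw_similarity, probs in detections:
        # Assumption: summarize the triphone evidence by its mean probability.
        crf_score = sum(probs) / len(probs) if probs else 0.0
        combined = alpha * dtw_similarity + (1.0 - alpha) * crf_score
        rescored.append((det_id, combined))
    # Highest combined score first.
    return sorted(rescored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    query = ["k", "a", "i", "g", "i"]
    print(triphones(query))
    first_pass = [("det_a", 0.9, [0.2, 0.2]), ("det_b", 0.8, [0.9, 0.9])]
    print(rerank(first_pass))
```

In this toy input, the detection with the slightly lower first-pass score but stronger triphone evidence moves to the top, which is the kind of reordering a second pass is meant to produce.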