Journal: Signal Processing, IET

Two-stage spoken term detection system for under-resourced languages


Abstract

Spoken Term Detection (STD) is the process of locating occurrences of spoken queries in a given speech database. Two methods are generally adopted for STD: sequence matching based on automatic speech recognition (ASR), and ASR-free, feature-based template matching. If a well-performing ASR system is available, the former is accurate. However, building an ASR system with consistent performance requires several hours of labelled corpora. Template matching methods work well for small or chopped utterances. In practice, however, the search database can be huge and contain sentences of varying lengths. The time complexity of template matching is therefore high, which makes it impractical for realistic search applications. In this work, a two-stage STD system is proposed that combines ASR-based phoneme sequence matching in the first stage with feature-sequence template matching of selected locations in the second stage. The time complexity of the second stage is reduced by performing template matching based on dynamic time warping (DTW) only at the probable query locations identified by the first stage. A 'split and match' approach helps to reduce false positives for longer query words. The effectiveness of the proposed method is demonstrated using English and Malayalam datasets.
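The two-stage idea can be illustrated with a minimal Python sketch. It is not the authors' implementation, and the abstract does not give these details, so everything below is an assumption made for illustration: stage 1 is approximated by edit-distance substring matching over a decoded phoneme string, each decoded phoneme is assumed to come with frame boundaries (`ph_frames`), the DTW local cost is Euclidean, and the function names, `max_edits`, and `dtw_threshold` are hypothetical. The 'split and match' step is not modelled.

```python
from math import dist, inf


def stage1_candidates(query_ph, decoded_ph, max_edits=1):
    """Stage 1 (assumed form): approximate substring matching.

    Returns (start, end) phoneme index ranges of decoded_ph whose edit
    distance to query_ph is at most max_edits.
    """
    m = len(query_ph)
    # col[i] = (cost, start) of the best alignment of query_ph[:i]
    # ending at the current position in decoded_ph.
    col = [(i, 0) for i in range(m + 1)]
    hits = []
    for j, ph in enumerate(decoded_ph):
        new = [(0, j + 1)]  # an empty query prefix may start anywhere
        for i in range(1, m + 1):
            sub = (col[i - 1][0] + (query_ph[i - 1] != ph), col[i - 1][1])
            skip_text = (col[i][0] + 1, col[i][1])
            skip_query = (new[i - 1][0] + 1, new[i - 1][1])
            new.append(min(sub, skip_text, skip_query))
        col = new
        cost, start = col[m]
        if cost <= max_edits and start < j + 1:
            hits.append((start, j + 1))
    return hits


def dtw_distance(a, b):
    """Length-normalised DTW cost between two feature-vector sequences."""
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        return inf
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = dist(a[i - 1], b[j - 1])  # Euclidean local cost
            D[i][j] = local + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m] / (n + m)


def two_stage_search(query_ph, query_feats, decoded_ph, ph_frames,
                     utt_feats, max_edits=1, dtw_threshold=0.5):
    """Run DTW only on the frame ranges proposed by stage 1.

    ph_frames[k] is the assumed (first_frame, end_frame) of decoded
    phoneme k; returns (first_frame, end_frame, score) for accepted hits.
    """
    results = []
    for start, end in stage1_candidates(query_ph, decoded_ph, max_edits):
        f0, f1 = ph_frames[start][0], ph_frames[end - 1][1]
        score = dtw_distance(query_feats, utt_feats[f0:f1])
        if score <= dtw_threshold:
            results.append((f0, f1, score))
    return results
```

Because DTW runs only on the short frame ranges returned by `stage1_candidates`, its cost scales with the number of candidates rather than with the length of the whole utterance. In this sketch, neighbouring candidates can overlap, so a practical system would probably merge them or keep only the best-scoring one.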
