N-Gram FST Indexing for Spoken Term Detection

Abstract

An efficient indexing scheme is essential for spoken term detection (STD) on large databases, particularly for phone-based systems, which have been widely adopted to achieve vocabulary-independent detection. While finite state transducer (FST) composition provides a standard indexing approach, n-gram reverse indexing is more flexible in connectivity representation and confidence measurement and may therefore yield better performance than searching within the original lattices or the equivalent FSTs. In this paper we present an n-gram FST indexing approach that combines the flexibility of n-gram indexing with the efficiency of FST indexing. Specifically, we employ n-gram indexing to relax connectivity in the original lattices and then formalize the indices into an FST for online search. We demonstrate this approach on a phone-based STD task where the lattice is sparse due to strong language models. The results show that n-gram FST indexing provides not only better detection performance than lattice search but also faster detection than both conventional n-gram and FST indexing.
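
The idea of relaxing lattice connectivity through n-gram indexing can be illustrated with a minimal Python sketch. This is not the authors' implementation: a plain dictionary stands in for the FST, and the arc format `(phone, start, end, posterior)`, the `max_gap` tolerance, the function names `build_index` and `search`, and the use of the minimum n-gram score as the detection confidence are all assumptions made for illustration. Two arcs are treated as connected when the second starts within `max_gap` seconds of the first one's end, instead of requiring a shared lattice node.

```python
from collections import defaultdict


def build_index(lattices, n=2, max_gap=0.05):
    """Build an n-gram inverted index from phone lattices.

    lattices: {utt_id: [(phone, start, end, posterior), ...]} (assumed format).
    Arcs are chained when the next arc starts within max_gap seconds of the
    previous arc's end, which relaxes strict lattice connectivity.
    """
    if n < 2:
        raise ValueError("n must be at least 2")
    index = defaultdict(list)
    for utt, arcs in lattices.items():
        # Grow chains of arc positions one arc at a time until length n.
        chains = [(i,) for i in range(len(arcs))]
        for _ in range(n - 1):
            chains = [c + (j,) for c in chains for j in range(len(arcs))
                      if j not in c
                      and abs(arcs[j][1] - arcs[c[-1]][2]) <= max_gap]
        for c in chains:
            score = 1.0
            for i in c:
                score *= arcs[i][3]  # product of arc posteriors
            key = tuple(arcs[i][0] for i in c)
            index[key].append((utt, c, arcs[c[0]][1], arcs[c[-1]][2], score))
    return index


def search(index, phones, n=2):
    """Return (utt_id, start, end, score) for each detection of a phone sequence.

    The query is split into overlapping n-grams; consecutive n-gram hits are
    joined only if they share the same n-1 arcs. The score is the minimum
    n-gram score along the chain (an illustrative choice).
    """
    grams = [tuple(phones[i:i + n]) for i in range(len(phones) - n + 1)]
    if not grams:
        return []
    cands = list(index.get(grams[0], []))
    for gram in grams[1:]:
        cands = [(utt, c + c2[-1:], start, end2, min(score, score2))
                 for utt, c, start, _, score in cands
                 for utt2, c2, _, end2, score2 in index.get(gram, [])
                 if utt2 == utt and c2[:-1] == c[-(n - 1):]]
    return [(utt, start, end, score) for utt, _, start, end, score in cands]


# Toy lattice: "t" starts 20 ms after "ae" ends, so the two arcs do not
# share a boundary but are still linked under the relaxed connectivity rule.
lattices = {
    "utt1": [("k", 0.00, 0.10, 0.9), ("ae", 0.10, 0.20, 0.8),
             ("t", 0.22, 0.30, 0.7), ("d", 0.21, 0.30, 0.2)],
}
hits = search(build_index(lattices), ["k", "ae", "t"])
```

With `max_gap=0.0` the same query finds nothing in this toy lattice, which shows how the tolerance trades strict connectivity for recall. Compiling the index into an actual FST, as the abstract describes, would require an FST toolkit and is outside this sketch.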

Bibliographic record

Source
Annual Conference of the International Speech Communication Association, 2012, pp. 2091-2094 (4 pages)
Authors
Chao Liu; Dong Wang; Javier Tejedor
Original format: PDF
Keywords
spoken term indexing; finite state transducer; spoken term detection; speech recognition

