AN EMPIRICAL STUDY OF CONFUSION MODELING IN KEYWORD SEARCH FOR LOW RESOURCE LANGUAGES

Abstract

Keyword search, in the context of low resource languages, has emerged as a key area of research. The dominant approach in keyword search is to use Automatic Speech Recognition (ASR) as a front end to produce a representation of audio that can be indexed. The biggest drawback of this approach lies in its inability to deal with out-of-vocabulary words and query terms that are not in the ASR system output. In this paper we present an empirical study evaluating various approaches that use confusion models as query expansion techniques to address this problem. We present results across four languages using a range of confusion models, which lead to significant improvements in keyword search performance as measured by the Maximum Term Weighted Value (MTWV) metric.
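
The abstract names MTWV as the evaluation metric but does not define it. As a rough illustration only, the sketch below follows the commonly used NIST-style definition, which is an assumption here and not taken from this record: TWV at a threshold is one minus the per-term average of the miss probability plus a weight `beta` times the false-alarm probability, and MTWV is the best TWV over all thresholds. The `Hit` record, the function names, the one-trial-per-second convention, and the default `beta = 999.9` are all illustrative assumptions.

```python
from dataclasses import dataclass

BETA = 999.9  # assumed false-alarm weight (value commonly used in NIST-style scoring)


@dataclass
class Hit:
    """Hypothetical record for one detection returned by a keyword search system."""
    term: str
    score: float
    correct: bool  # True if the detection matches a reference occurrence


def twv(hits, n_true, speech_seconds, threshold, beta=BETA):
    """Term Weighted Value at one decision threshold.

    n_true maps each term to its number of reference occurrences.
    Terms with no reference occurrences are skipped, since their miss
    probability is undefined. Non-target trials are assumed to number
    speech_seconds - n_true[term] (one trial per second of speech).
    """
    terms = [t for t, n in n_true.items() if n > 0]
    if not terms:
        raise ValueError("need at least one term with reference occurrences")
    total = 0.0
    for term in terms:
        kept = [h for h in hits if h.term == term and h.score >= threshold]
        n_correct = sum(h.correct for h in kept)
        n_spurious = len(kept) - n_correct
        p_miss = 1.0 - n_correct / n_true[term]
        p_fa = n_spurious / (speech_seconds - n_true[term])
        total += p_miss + beta * p_fa
    return 1.0 - total / len(terms)


def mtwv(hits, n_true, speech_seconds, beta=BETA):
    """Maximum TWV over all thresholds that change the set of accepted hits."""
    thresholds = sorted({h.score for h in hits})
    thresholds.append(float("inf"))  # accept nothing: every term is missed, TWV = 0
    return max(twv(hits, n_true, speech_seconds, th, beta) for th in thresholds)
```

With the large weight on false alarms, a single spurious detection can cancel the gain from several correct ones. That is why query expansion with confusion models, which recovers missed terms but also proposes more candidates, has to be judged at the threshold that balances the two.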


Bibliographic details

Source
IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU 2013), 2013, 6 pages
Authors
Murat Saraclar; Abhinav Sethy; Bhuvana Ramabhadran; Lidia Mangu; Jia Cui; Xiaodong Cui; Brian Kingsbury; Jonathan Mamou
Original format: PDF
Chinese Library Classification: TN912.3-532
Keywords
ASR; Maximum Term; MTWV

Similar documents

1. Cross-language phoneme mapping for phonetic search keyword spotting in continuous speech of under-resourced languages [J]. Ella Tetariy, Yossi Bar-Yosef, Vered Silber-Varod. Artificial Intelligence Research, 2015, no. 2
2. A Keyword-Aware Language Modeling Approach to Spoken Keyword Search [J]. Chen I-Fan, Ni Chongjia, Lim Boon Pang. Journal of VLSI signal processing systems, 2016, no. 2
3. Feature learning for efficient ASR-free keyword spotting in low-resource languages [J]. Ewald van der Westhuizen, Herman Kamper, Raghav Menon. Computer speech and language, 2022, no. Jana
4. An empirical study of confusion modeling in keyword search for low resource languages [C]. Saraclar Murat, Sethy Abhinav, Ramabhadran Bhuvana. IEEE Workshop on Automatic Speech Recognition and Understanding, 2013
5. Turkic Interlingua: A Case Study of Machine Translation in Low-Resource Languages [D]. Mirzakhalov, Jamshidbek. 2021
6. Enhancing African low-resource languages: Swahili data for language modelling [O]. Casper S. Shikali, Refuoe Mokhosi. 2020
7. Quantifying the value of pronunciation lexicons for keyword search in low resource languages [O]. Guoguo Chen, Sanjeev Khudanpur, Daniel Povey. 2013
