Efficient query-by-example spoken document retrieval combining phone multigram representation and dynamic time warping

Paula Lopez-Otero; Javier Parapar; Alvaro Barreiro

Abstract

Query-by-example spoken document retrieval (QbESDR) aims to find the documents in a collection that contain a given spoken query. Current approaches are generally unsuited to real-world applications: they focus mostly on effectiveness (i.e. reliably detecting which documents contain the query), whereas practical implementations must also be efficient (i.e. the search must complete within a limited time) to provide a satisfactory user experience. In addition, systems usually search for exact matches of the query, which limits the number of relevant documents retrieved. This paper proposes a representation of documents and queries for QbESDR that combines phone n-grams of different sizes obtained from automatic transcriptions, termed the phone multigram representation. Since phone transcriptions usually contain errors, several hypotheses for the query transcription are combined to reduce the impact of these errors. The proposed system stores the documents in inverted indices, which leads to fast and efficient search. Two combinations of the phone multigram strategy with a state-of-the-art pattern-matching system based on dynamic time warping (DTW) are proposed: one is a two-stage system intended to be as effective as, but more efficient than, a DTW-based system, while the other aims to improve on the performance of both systems by combining their output scores. Experiments on the MediaEval 2014 Query-by-Example Search on Speech (QUESST 2014) evaluation framework suggest that the phone multigram representation is a successful approach to QbESDR, and that the assessed combinations with a DTW-based strategy lead to more efficient and effective QbESDR systems. In addition, the phone multigram approach increased the detection of non-exact matches of the queries.
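The indexing idea in the abstract can be illustrated with a minimal sketch. It assumes phone transcriptions are already available as lists of phone symbols, and it uses a toy overlap score weighted by n-gram length; the function names, the n-gram size range, and the scoring rule are all assumptions made for illustration, not the scoring used in the paper.

```python
from collections import Counter, defaultdict


def multigrams(phones, min_n=1, max_n=3):
    """Return all phone n-grams of sizes min_n..max_n as tuples."""
    grams = []
    for n in range(min_n, max_n + 1):
        for i in range(len(phones) - n + 1):
            grams.append(tuple(phones[i:i + n]))
    return grams


def build_index(documents, min_n=1, max_n=3):
    """Build an inverted index mapping each n-gram to {doc_id: count}."""
    index = defaultdict(dict)
    for doc_id, phones in documents.items():
        counts = Counter(multigrams(phones, min_n, max_n))
        for gram, count in counts.items():
            index[gram][doc_id] = count
    return index


def search(index, query_hypotheses, min_n=1, max_n=3):
    """Rank documents by n-gram overlap, pooled over all query hypotheses."""
    scores = Counter()
    for phones in query_hypotheses:
        counts = Counter(multigrams(phones, min_n, max_n))
        for gram, q_count in counts.items():
            # Only documents that contain this n-gram are touched.
            for doc_id, d_count in index.get(gram, {}).items():
                # Assumed toy weighting: longer n-grams contribute more.
                scores[doc_id] += len(gram) * min(q_count, d_count)
    return scores.most_common()


# Hypothetical phone transcriptions of three documents.
documents = {
    "doc_a": ["k", "a", "s", "a"],
    "doc_b": ["m", "e", "s", "a"],
    "doc_c": ["p", "e", "r", "o"],
}
# Two hypotheses for the same spoken query; the second has a decoding error.
query_hypotheses = [["k", "a", "s", "a"], ["k", "a", "z", "a"]]

ranked = search(build_index(documents), query_hypotheses)
print(ranked)  # [('doc_a', 21), ('doc_b', 5)]
```

Because lookups only visit documents that share at least one n-gram with the query, documents such as doc_c are never scored. In a two-stage arrangement like the one the abstract describes, a ranked list of this kind could serve to select the candidates passed on to a costlier DTW comparison.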

Bibliographic record

Source: Information Processing & Management, 2019, Issue 1, pp. 43-60 (18 pages)
Authors: Paula Lopez-Otero; Javier Parapar; Alvaro Barreiro
Affiliation: Universidade da Coruña - CITIC
Original format: PDF
Language: English
Keywords: Query-by-example spoken document retrieval; Phone decoding; Phone n-grams; Phone posteriorgrams; Dynamic time warping

Similar literature

1. Statistical language models for query-by-example spoken document retrieval [J]. Paula Lopez-Otero, Javier Parapar, Alvaro Barreiro. Multimedia Tools and Applications, 2020, Issue 11-12.
2. Adaptation of Multilingual Subphonetic Segment for Spoken Document Retrieval [J]. Shi-wook Lee, Kazuyo Tanaka, Yoshiaki Itoh. 電子情報通信学会技術研究報告. 音声. Speech, 2003, No. 519.
3. Adaptation of Multilingual Subphonetic Segment for Spoken Document Retrieval [J]. Shi-wook Lee, Kazuyo Tanaka, Yoshiaki Itoh. 電子情報通信学会技術研究報告. 言語理解とコミュニケーション. Natural Language Understanding and Models of Communication, 2003, No. 517.
4. Combining Word and Phonetic-Code Representations for Spoken Document Retrieval [C]. Alejandro Reyes-Barragan, Manuel Montes-y-Gómez, Luis Villasenor-Pineda. Annual Conference on Intelligent Text Processing and Computational Linguistics; CICLing 2011, 2011.
5. Music Retrieval System Using Dynamic Time Warping [D]. Okafor, Emeka Jude. 2019.
6. Twadn: an efficient alignment algorithm based on time warping for pairwise dynamic networks [O]. Yuanke Zhong, Jing Li, Junhao He. 2020.
7. Combining Word and Phonetic-Code Representations for Spoken Document Retrieval [O]. Ro Reyes-Barragán, Manuel Montes-y-Gómez, Luis Villaseñor-Pineda. 2013.
8. Real-Time Spoken-Language System for Interactive Problem-Solving, Combining Linguistic and Statistical Technology for Improved Spoken Language Understanding [R]. Moore, R. C., Cohen, M. H. 1993.
