Building Test Speech Dataset on Russian Language for Spoken Document Retrieval Task



Abstract

The article presents a technique for creating a speech dataset for testing spoken document retrieval methods. The dataset includes radio news audio files with speech in Russian, text files with the spoken words, text files with the words recognized by CMU Pocketsphinx, and a set of queries with the relevant documents indicated. Query words in the set are labeled with the type of recognition error: word replacement, word distortion, word split, or word deletion. The dataset also contains expert judgments of which documents are relevant to each query.
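The components listed in the abstract can be pictured as a small record schema. The Python sketch below is only an illustration under stated assumptions: the class names, field names, file name, and the sample Russian sentence are all hypothetical and are not taken from the paper or from the dataset itself. It shows why labeling recognition errors matters: a naive exact-match search finds a document through its transcript but misses it through distorted recognizer output.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Set


class RecognitionError(Enum):
    """The four error types named in the abstract."""
    REPLACEMENT = "word replacement"
    DISTORTION = "word distortion"
    SPLIT = "word split"
    DELETION = "word deletion"


@dataclass
class Document:
    doc_id: str
    audio_path: str   # radio news audio file (hypothetical path)
    transcript: str   # text of the words actually spoken
    recognized: str   # text of the recognizer output


@dataclass
class Query:
    words: List[str]
    # Assumed layout: one error label per query word.
    error_labels: Dict[str, RecognitionError]
    relevant_doc_ids: Set[str]  # expert relevance judgments


def exact_match(query: Query, documents: List[Document],
                use_recognized: bool) -> Set[str]:
    """Naive baseline: return ids of documents containing every query word."""
    hits = set()
    for doc in documents:
        text = doc.recognized if use_recognized else doc.transcript
        tokens = set(text.split())
        if all(word in tokens for word in query.words):
            hits.add(doc.doc_id)
    return hits


def recall(retrieved: Set[str], query: Query) -> float:
    """Fraction of the expert-marked relevant documents that were retrieved."""
    if not query.relevant_doc_ids:
        return 0.0
    return len(retrieved & query.relevant_doc_ids) / len(query.relevant_doc_ids)


# Invented sample: the recognizer distorts the first word of the sentence.
docs = [Document("d1", "d1.wav",
                 "президент подписал закон",
                 "резидент подписал закон")]
q = Query(["президент"],
          {"президент": RecognitionError.DISTORTION},
          {"d1"})

recall_on_transcript = recall(exact_match(q, docs, use_recognized=False), q)
recall_on_recognized = recall(exact_match(q, docs, use_recognized=True), q)
print(recall_on_transcript, recall_on_recognized)
```

Grouping queries by their error label would let a retrieval method be scored separately for each error type, which is one plausible use of the labels the abstract describes.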


Bibliographic details

Source
IEEE East-West Design and Test Symposium | 2018 | pp. 1-4 | 4 pages
Conference location: Kazan (RU)
Authors
Alexandra Tatarinova; Dmitriy Prozorov
Author affiliation

Vyatka State University

Original format: PDF
Keywords
Speech recognition; Dictionaries; Task analysis; Acoustics; Histograms; Hidden Markov models;


Similar literature

1. Improving Keyword Recognition of Spoken Queries by Combining Multiple Speech Recognizer's Outputs for Speech-driven WEB Retrieval Task [J]. Masahiko Matsushita, Hiromitsu Nishizaki, Takehito Utsuro. IEICE Transactions on Information and Systems. 2005, No. 3

2. Syllable-Based Chinese Text/Spoken Document Retrieval Using Text/Speech Queries [J]. Bo-Ren Bai, Berlin Chen, Hsin-Min Wang. International Journal of Pattern Recognition and Artificial Intelligence. 2000, No. 5

3. Statistical language models for query-by-example spoken document retrieval [J]. Paula Lopez-Otero, Javier Parapar, Alvaro Barreiro. Multimedia Tools and Applications. 2020, No. 11a12

4. Building Test Speech Dataset on Russian Language for Spoken Document Retrieval Task [C]. Alexandra Tatarinova, Dmitriy Prozorov. IEEE East-West Design & Test Symposium. 2018

5. Audio parsing and rapid speaker adaptation in speech recognition for spoken document retrieval [D]. Zhou, Bowen. 2003

6. Fast mapping semantic features: Performance of adults with normal language history of disorders of spoken and written language and attention deficit hyperactivity disorder on a word learning task [O]. Mary Alt, Michelle L. Gutmann

7. Cross-Language Spoken Document Retrieval Using HMM-Based Retrieval Model with Multi-Scale Fusion [O]. Wai-kit Lo, Helen Meng, P. C. Ching. 2009
