Published in: International Conference on Text, Speech and Dialogue

Fast Phonetic/Lexical Searching in the Archives of the Czech Holocaust Testimonies: Advancing Towards the MALACH Project Visions

Abstract

In this paper we describe a system for fast phonetic/lexical searching in the large archives of Czech Holocaust testimonies. The system is a first step towards fulfilling the MALACH project visions [1,2], at least with respect to easier and faster access to the Czech part of the archives. More than one thousand hours of spontaneous, accented and highly emotional speech of Czech Holocaust survivors, stored at the USC Shoah Foundation Institute as video interviews, were automatically transcribed and phonetically/lexically indexed. Special attention was paid to the processing of colloquial words, which appear very frequently in spontaneous Czech speech. The resulting access to the archives is very fast and makes it possible to detect interview segments containing given spoken words, clusters of words occurring within pre-defined time intervals, and words that were not included in the working vocabulary (out-of-vocabulary, OOV, words).
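
To make the idea of searching for clusters of words within a time interval more concrete, here is a minimal sketch of a time-stamped inverted index. It is not the authors' implementation: the transcript format, the function names (`build_index`, `find_word`, `find_cluster`) and the sample data are all assumptions made for illustration. It covers only lexical lookup, not the phonetic indexing the abstract mentions for OOV words.

```python
from bisect import bisect_left
from collections import defaultdict

# Hypothetical transcript format (an assumption, not from the paper):
# one tuple per recognized word: (interview_id, word, start_time_in_seconds).
SAMPLE = [
    ("tape1", "transport", 10.0),
    ("tape1", "do", 10.6),
    ("tape1", "Terezína", 11.0),
    ("tape1", "transport", 95.0),
    ("tape2", "Terezína", 12.0),
]


def build_index(transcript):
    """Map each lower-cased word to a sorted list of (interview_id, start_time)."""
    index = defaultdict(list)
    for interview_id, word, start in transcript:
        index[word.lower()].append((interview_id, start))
    for postings in index.values():
        postings.sort()
    return index


def find_word(index, word):
    """Return every (interview_id, start_time) occurrence of a single word."""
    return list(index.get(word.lower(), []))


def _occurs_within(postings, interview_id, t0, window):
    """True if postings has an entry in the same interview in [t0, t0 + window]."""
    i = bisect_left(postings, (interview_id, t0))
    return (
        i < len(postings)
        and postings[i][0] == interview_id
        and postings[i][1] <= t0 + window
    )


def find_cluster(index, words, window):
    """Return occurrences of the first word that are followed, in the same
    interview and within `window` seconds, by every other word in `words`.

    Simplification: the first word anchors the window, so the other words
    must start at or after it.
    """
    first, *rest = [w.lower() for w in words]
    return [
        (interview_id, t0)
        for interview_id, t0 in index.get(first, [])
        if all(_occurs_within(index.get(w, []), interview_id, t0, window) for w in rest)
    ]
```

Calling `find_cluster(build_index(SAMPLE), ["transport", "Terezína"], 5.0)` returns the positions where both words occur within five seconds of each other. A word that is absent from the index simply yields an empty result, which is why a purely lexical index like this one cannot retrieve OOV words and a phonetic index is needed for them.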