Searching spontaneous conversational speech workshop 2009

Multimedia Retrieval Through Indexing Speech: An Enterprise Perspective

Abstract

The institutional memory of enterprises increasingly consists of digital multimedia content, such as online lecture videos and presentations, archived meetings or conference calls, and voicemail. A key technology for efficiently managing such content is keyword search into the spoken audio content using automatic speech recognition (ASR).

A key lesson from deploying ASR-based indexing in enterprises is that multimedia content is often not stored in a centralized hosting application, but in a "long tail" of small teams' intranet sites, often built by technology enthusiasts who like to tinker and make creative use of technology. This calls for an indexing platform rather than a standalone app, with audio indexing as one feature, that is easy to deploy with limited IT skills in a "do-it-yourself" manner and that integrates with the existing information-management infrastructure.

We will present approaches to three enterprise-characteristic challenges arising from these requirements: (1) probabilistic indexing of word lattices instead of speech-to-text transcripts, to address the limited recognition accuracy (often in the 50% range due to lack of matching acoustic/domain corpora); (2) phonetic search and vocabulary adaptation for indexing person names, domain terminology, and code names missing in a standard recognizer; and (3) approximations to implement probabilistic lattice indexing on top of existing industry-strength full-text search engines, for maximal reuse and integration with existing tools and deployments to reduce cost, and to enable non-speech experts to manage and operate the indexing/search system and to build or mash up line-of-business applications around it.
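As a rough illustration of challenge (1) in the abstract, the sketch below indexes word hypotheses with their posterior probabilities instead of a single best transcript, and ranks recordings by the expected count of a keyword. The lattice representation (a flat list of `(word, start_seconds, posterior)` tuples), the function names, and the sample data are all assumptions made for this example; they are not taken from the paper, and a real lattice would also carry end times and graph structure.

```python
from collections import defaultdict

# Hypothetical, simplified lattice: each entry is one word hypothesis
# (word, start_seconds, posterior), where posterior is the recognizer's
# estimated probability that the word was spoken at that time.
def build_index(lattices):
    """Map each word to postings of (doc_id, start_seconds, posterior)."""
    index = defaultdict(list)
    for doc_id, hypotheses in lattices.items():
        for word, start, posterior in hypotheses:
            index[word.lower()].append((doc_id, start, posterior))
    return index

def search(index, keyword):
    """Rank documents by expected keyword count (sum of posteriors)."""
    expected = defaultdict(float)
    for doc_id, _start, posterior in index.get(keyword.lower(), []):
        expected[doc_id] += posterior
    return sorted(expected.items(), key=lambda kv: kv[1], reverse=True)

# Made-up sample data for illustration only.
lattices = {
    "meeting_a": [("budget", 3.2, 0.9), ("budge", 3.2, 0.1), ("review", 4.0, 0.6)],
    "voicemail_b": [("budget", 1.5, 0.3), ("bucket", 1.5, 0.7)],
}
index = build_index(lattices)
results = search(index, "budget")
```

In this made-up data, a best-path transcript of `voicemail_b` would contain "bucket" and miss "budget" entirely, whereas the posterior-weighted index still returns that recording, ranked below `meeting_a`.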
