Journal: Language Resources and Evaluation

Unleashing the killer corpus: experiences in creating the multi-everything AMI Meeting Corpus


Abstract

The AMI Meeting Corpus contains 100 h of meetings captured using many synchronized recording devices, and is designed to support work in speech and video processing, language engineering, corpus linguistics, and organizational psychology. It has been transcribed orthographically, with annotated subsets for everything from named entities, dialogue acts, and summaries to simple gaze and head movement. In this written version of an LREC conference keynote address, I describe the data and how it was created. If this is "killer" data, that presupposes a platform that it will "sell"; in this case, that is the NITE XML Toolkit, which allows a distributed set of users to create, store, browse, and search annotations for the same base data that are both time-aligned against signal and related to each other structurally.
