European Signal Processing Conference (EUSIPCO 2009)

VIDEO AND AUDIO BASED DETECTION OF FILLED HESITATION PAUSES IN CLASSROOM LECTURES

Abstract

In this paper we study the detection of filled hesitation pauses in oral presentations of university lectures taught in Greek and recorded on a tablet PC with specialized software. We propose a hierarchical approach that fuses video data with audio data to increase the precision of the detection system. The detection method works at the frame level rather than the usual segment level, which allows more accurate synchronization of the audio and video data after the detected hesitations are removed. Audio characteristics are modeled using Gaussian Mixture Models, while the stationarity of the recorded video is also taken into account. This combination of video and audio yields higher precision and recall rates than other works in the literature. On a dataset of approximately 7 hours, the precision is 99.6% and the recall is 84.7% when both audio and video data are used.
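The abstract does not give implementation details, so the following is only a minimal sketch of how a frame-level, audio-then-video decision could look. It assumes (hypothetically) that each audio frame is scored by the log-likelihood ratio between a "filled pause" GMM and an "other speech" GMM with diagonal covariances, and that an audio detection is kept only when a per-frame video motion score indicates a stationary picture. The function names, thresholds, toy model parameters, and the exact fusion rule are illustrative assumptions, not the authors' method.

```python
import math

def diag_gmm_loglik(x, weights, means, variances):
    """Log-likelihood of one feature vector under a diagonal-covariance GMM."""
    comp_logs = []
    for w, mu, var in zip(weights, means, variances):
        ll = math.log(w)
        for xi, mi, vi in zip(x, mu, var):
            ll += -0.5 * (math.log(2 * math.pi * vi) + (xi - mi) ** 2 / vi)
        comp_logs.append(ll)
    # log-sum-exp over mixture components for numerical stability
    m = max(comp_logs)
    return m + math.log(sum(math.exp(c - m) for c in comp_logs))

def detect_filled_pauses(audio_feats, video_motion, pause_gmm, speech_gmm,
                         llr_threshold=0.0, motion_threshold=0.1):
    """Hypothetical two-stage rule: audio GMM test first, then a video stationarity check."""
    decisions = []
    for x, motion in zip(audio_feats, video_motion):
        llr = diag_gmm_loglik(x, *pause_gmm) - diag_gmm_loglik(x, *speech_gmm)
        audio_hit = llr > llr_threshold
        video_stationary = motion < motion_threshold  # assumed fusion criterion
        decisions.append(audio_hit and video_stationary)
    return decisions

def precision_recall(predicted, truth):
    """Frame-level precision and recall from boolean sequences."""
    tp = sum(1 for p, t in zip(predicted, truth) if p and t)
    fp = sum(1 for p, t in zip(predicted, truth) if p and not t)
    fn = sum(1 for p, t in zip(predicted, truth) if t and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Toy 1-D example; each GMM is (weights, means, variances), all values invented.
pause_gmm = ([1.0], [[0.0]], [[1.0]])
speech_gmm = ([1.0], [[4.0]], [[1.0]])
audio_feats = [[0.1], [3.9], [0.2], [-0.3]]
video_motion = [0.0, 0.0, 0.5, 0.05]
truth = [True, False, True, True]

decisions = detect_filled_pauses(audio_feats, video_motion, pause_gmm, speech_gmm)
precision, recall = precision_recall(decisions, truth)
```

In this toy run the third frame is a true filled pause that the audio test accepts but the video check rejects, so the sketch reproduces the qualitative trade-off reported in the abstract: the video stage favors precision at some cost in recall.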
