Conference: International Conference on Signal Processing and Communication Systems

Classification of story-telling and poem recitation using head gesture of the talker

Abstract

In this work, we investigate the nature of head gestures in spontaneous speech during story-telling and compare them with those in poem recitation. We hypothesize that head gestures during poem recitation are more repetitive and structured than those in spontaneous speech. To quantify this, we propose a measure called degree of repetition (DoR). We also perform a story-telling vs. poem recitation classification experiment using a deep neural network (DNN). For the classification, both DoR and context-dependent raw head gesture data are used. Analysis and experiments are performed on a database of 24 subjects each telling five stories and a different set of 10 subjects each reciting 20 poems three times each, which gives data of comparable duration for story-telling and poem recitation. Analysis of head gestures using DoR reveals that the DoR is, on average, higher during poem recitation than during story-telling. A four-fold classification experiment between story-telling and poem recitation using the DNN demonstrates that raw head gestures yield an average classification accuracy of 85.79% and an average F-score of 89.05%, while the DoR yields an average accuracy and F-score of 80.59% and 82.30%, respectively, indicating that the features learnt by the DNN from raw head gestures are more discriminative than the DoR features. While these accuracies and F-scores are lower than those (94.67% and 95.60%) obtained using acoustic features such as Mel-frequency cepstral coefficients (MFCCs), raw head gestures and MFCCs together yield a higher average accuracy (98.62%) and F-score (98.92%), indicating that head gestures are complementary to the acoustic features for the classification task.
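The fold-averaged accuracy and F-score reported above can be illustrated with a minimal sketch. It is not the authors' evaluation code: the `FoldCounts` type, the helper names, and the choice of poem recitation as the positive class are assumptions made for illustration, and the abstract does not state how the folds were formed.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class FoldCounts:
    """Confusion counts for one fold (hypothetical container).

    Assumption: poem recitation is treated as the positive class.
    """
    tp: int  # poems classified as poems
    fp: int  # stories classified as poems
    fn: int  # poems classified as stories
    tn: int  # stories classified as stories


def accuracy(c: FoldCounts) -> float:
    """Fraction of all samples in the fold that were classified correctly."""
    total = c.tp + c.fp + c.fn + c.tn
    return (c.tp + c.tn) / total


def f_score(c: FoldCounts) -> float:
    """F1 score: harmonic mean of precision and recall for the positive class."""
    denom = 2 * c.tp + c.fp + c.fn
    return 2 * c.tp / denom if denom else 0.0


def average_over_folds(folds: list[FoldCounts]) -> tuple[float, float]:
    """Compute each metric per fold, then average across folds."""
    return (
        mean(accuracy(c) for c in folds),
        mean(f_score(c) for c in folds),
    )


if __name__ == "__main__":
    # Made-up counts for four folds, for illustration only.
    folds = [
        FoldCounts(tp=8, fp=2, fn=2, tn=8),
        FoldCounts(tp=9, fp=1, fn=1, tn=9),
        FoldCounts(tp=7, fp=3, fn=1, tn=9),
        FoldCounts(tp=9, fp=2, fn=1, tn=8),
    ]
    avg_acc, avg_f = average_over_folds(folds)
    print(f"average accuracy: {avg_acc:.4f}, average F-score: {avg_f:.4f}")
```

Averaging per-fold metrics, as here, can differ slightly from pooling the counts of all folds before computing the metrics. The abstract's wording ("average accuracy", "average F-score") suggests per-fold averaging, but that reading is an assumption.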