Prosody recognition in male infant-directed speech

Abstract

Robots designed to learn from and interact with humans require an intuitive method for humans to communicate with them. Normal human speech is very difficult to process: a robot needs many kinds of complex analysis to interpret it. An intermediate method of communication is recognition of prosody, the affective content of speech. Using prosody recognition, a human interacting with a robot can reward or punish its actions by praising or scolding it. In this project, prosody recognition of male voices is performed by feature-based analysis of sound files containing short utterances. The utterances were recorded from subjects who were directed to emulate infant-directed speech, which generally contains exaggerated prosody (Breazeal, C. and Aryananda, L., 2000). The features are extracted from the energy and pitch contours in the preprocessing stage. The classifier discriminates among four affective classes of speech and neutral utterances. The four classes are prohibition, attentional bids, approval, and soothing; the neutral utterances are speech that carries none of these affective intents. Discrimination is performed using a multistage k-nearest neighbor classifier. The five-way single-stage classifier operates at 62.5 percent accuracy on the entire male speech data set, while the female single-stage classifier classifies 66.7 percent correctly. Chi-square analysis yielded p ≤ 0.001 for each. The data seem to indicate that, while female voice data may be somewhat easier to classify than male voice data, there are no fundamental differences that make male utterances unsuitable for classification.
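The five-way classification step described in the abstract can be illustrated with a minimal sketch of a single-stage k-nearest neighbor classifier. The abstract does not list the actual features, the value of k, or the structure of the multistage version, so everything below is an assumption: the three features (mean pitch, pitch range, mean energy), the training values, the function name `knn_classify`, and `k=3` are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter

# Hypothetical feature vectors: (mean pitch in Hz, pitch range in Hz, mean energy).
# These values are invented for illustration; they are not data from the study.
TRAIN = [
    ((120.0, 30.0, 0.80), "prohibition"),
    ((115.0, 25.0, 0.85), "prohibition"),
    ((220.0, 150.0, 0.70), "approval"),
    ((230.0, 160.0, 0.75), "approval"),
    ((240.0, 90.0, 0.90), "attention"),
    ((250.0, 100.0, 0.95), "attention"),
    ((130.0, 40.0, 0.30), "soothing"),
    ((125.0, 35.0, 0.25), "soothing"),
    ((160.0, 50.0, 0.50), "neutral"),
    ((165.0, 55.0, 0.55), "neutral"),
]


def knn_classify(query, train, k=3):
    """Return the majority label among the k training vectors nearest to query.

    Distance is plain Euclidean distance on the raw features. A real system
    would normally rescale the features first so that pitch (in Hz) does not
    dominate energy; that step is omitted here to keep the sketch short.
    """
    neighbors = sorted(train, key=lambda item: math.dist(query, item[0]))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # A low, flat, loud utterance falls nearest the "prohibition" examples.
    print(knn_classify((118.0, 28.0, 0.82), TRAIN))
```

A multistage variant, as named in the abstract, would presumably chain several such decisions (for example, separating broad groups first and then classes within a group), but the abstract gives no detail on the stages, so this sketch stays single-stage.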
