Source journal: The Arabian journal for science and engineering

A PRELIMINARY STUDY OF PROSODY-BASED DETECTION OF QUESTIONS IN ARABIC SPEECH MONOLOGUES

Abstract

Prosodic features have been widely used in many speech-related applications, including speaker and word recognition, emotion and accent identification, topic and sentence segmentation, and text-to-speech applications. Languages other than Arabic have received a lot of attention in this regard. One important application of prosodic features, investigated here, is identifying question sentences in Arabic monologue lectures. To the best of our knowledge, this is the first attempt at addressing question detection from spoken lectures in any language. To this end, we developed a small corpus of 1028 utterances extracted from 15 Arabic spoken lectures. We approach this problem by first segmenting the continuous speech (recorded lectures) into sentences using both intensity and duration features. Prosodic features are then extracted from each sentence. These features are used as input to four different classifiers, which label each sentence as either a question or a non-question. Our results suggest that questions are cued by more than one type of prosodic feature in spontaneous Arabic speech. We classified questions with an accuracy of 77.43%. A feature-specific analysis further reveals that energy and fundamental frequency (F0) features are mainly responsible for discriminating between question and non-question sentences. In terms of classification, we found that a Bayes Network performs better than support vector machines, multi-layer perceptron neural networks, or decision trees on our dataset. Removing correlated features through Correlation-based Feature Selection produced more efficient and accurate results than the complete feature set.
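
The abstract says only that sentence segmentation uses intensity and duration features and gives no further detail. As a rough illustration of that idea, and not the authors' method, the sketch below cuts a recording wherever frame intensity stays low for longer than a minimum pause duration. The function names, the -35 dB silence threshold, and the 0.3 s minimum pause are all assumptions chosen for the example.

```python
import math

def frame_intensity_db(samples, sr, frame_s=0.025, hop_s=0.010):
    """Per-frame RMS intensity in dB relative to the loudest frame."""
    frame, hop = int(sr * frame_s), int(sr * hop_s)
    if not samples:
        return [], hop
    rms = []
    for start in range(0, max(1, len(samples) - frame + 1), hop):
        chunk = samples[start:start + frame]
        rms.append(math.sqrt(sum(x * x for x in chunk) / len(chunk)))
    peak = max(max(rms), 1e-10)
    return [20 * math.log10(max(r, 1e-10) / peak) for r in rms], hop

def segment_by_pauses(samples, sr, silence_db=-35.0, min_pause_s=0.3):
    """Return (start_s, end_s) spans separated by long low-intensity pauses.

    silence_db and min_pause_s are illustrative values, not from the paper.
    """
    db, hop = frame_intensity_db(samples, sr)
    min_pause_frames = int(round(min_pause_s * sr / hop))
    segments, start, silent_run = [], None, 0
    for i, level in enumerate(db):
        if level > silence_db:            # intensity cue: frame is speech
            if start is None:
                start = i
            silent_run = 0
        elif start is not None:
            silent_run += 1
            if silent_run >= min_pause_frames:   # duration cue: pause is long
                end = i - silent_run + 1         # index of first silent frame
                segments.append((start * hop / sr, end * hop / sr))
                start, silent_run = None, 0
    if start is not None:                 # close a segment that runs to the end
        end = len(db) - silent_run
        segments.append((start * hop / sr, end * hop / sr))
    return segments

# Synthetic check: 1 s tone, 0.5 s silence, 1 s tone at 16 kHz.
sr = 16000
tone = [0.5 * math.sin(2 * math.pi * 200 * n / sr) for n in range(sr)]
signal = tone + [0.0] * (sr // 2) + tone
segments = segment_by_pauses(signal, sr)
print(segments)
```

On real lecture audio the threshold would need tuning per recording, and each resulting segment would then go on to prosodic feature extraction and classification as described in the abstract.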