IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2009)
Genre effects on automatic sentence segmentation of speech: A comparison of broadcast news and broadcast conversations

Abstract

We investigate genre effects on the task of automatic sentence segmentation, focusing on two important domains: broadcast news (BN) and broadcast conversation (BC). We employ a hidden Markov model (HMM) based on textual and prosodic information and analyze differences in segmentation accuracy and feature usage between the two genres using both manual and automatic speech transcripts. Experiments are evaluated using Czech broadcast corpora annotated for sentence-like units (SUs). Prosodic features capture information about pause, duration, pitch, and energy patterns. Textual knowledge sources include words, part-of-speech tags, and automatically induced classes. We also analyze the effects of using additional textual data that are not annotated for SUs. Feature analysis reveals significant differences in both textual and prosodic feature usage patterns between the two genres. The analysis is important for building automatic understanding systems when only limited matched-genre data are available, or for designing eventual genre-independent systems.
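
As a rough illustration of how an HMM-style segmenter can combine textual and prosodic evidence, the sketch below tags each inter-word boundary as either a sentence-like unit (SU) boundary or not, using Viterbi decoding over two hidden states. It is not the authors' implementation: the function names `combine` and `viterbi`, the log-linear weighting, and every probability in the toy example are assumptions made for illustration only.

```python
import math

STATES = ("no_boundary", "su_boundary")


def combine(text_prob, prosody_prob, weight=0.5):
    """Log-linear mix of textual and prosodic scores per boundary.

    Both inputs are lists of dicts mapping state -> probability.
    The weighting scheme is an illustrative assumption.
    """
    return [
        {s: (1 - weight) * math.log(t[s]) + weight * math.log(p[s]) for s in STATES}
        for t, p in zip(text_prob, prosody_prob)
    ]


def viterbi(emission_logp, trans_logp, init_logp):
    """Return the highest-scoring state sequence over all boundaries."""
    if not emission_logp:
        return []
    # best[t][s]: best log-score of any path ending in state s at boundary t
    best = [{s: init_logp[s] + emission_logp[0][s] for s in STATES}]
    back = []
    for t in range(1, len(emission_logp)):
        scores, pointers = {}, {}
        for s in STATES:
            prev = max(STATES, key=lambda p: best[-1][p] + trans_logp[(p, s)])
            scores[s] = best[-1][prev] + trans_logp[(prev, s)] + emission_logp[t][s]
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Trace the best path backwards from the best final state
    state = max(STATES, key=lambda s: best[-1][s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Toy example: three inter-word boundaries with made-up probabilities.
text_prob = [
    {"no_boundary": 0.9, "su_boundary": 0.1},
    {"no_boundary": 0.2, "su_boundary": 0.8},
    {"no_boundary": 0.7, "su_boundary": 0.3},
]
prosody_prob = [  # e.g. from a classifier over pause, duration, pitch, energy
    {"no_boundary": 0.8, "su_boundary": 0.2},
    {"no_boundary": 0.05, "su_boundary": 0.95},
    {"no_boundary": 0.9, "su_boundary": 0.1},
]
init_logp = {"no_boundary": math.log(0.8), "su_boundary": math.log(0.2)}
trans_logp = {
    ("no_boundary", "no_boundary"): math.log(0.8),
    ("no_boundary", "su_boundary"): math.log(0.2),
    ("su_boundary", "no_boundary"): math.log(0.9),
    ("su_boundary", "su_boundary"): math.log(0.1),
}

path = viterbi(combine(text_prob, prosody_prob), trans_logp, init_logp)
print(path)
```

In this toy setup the `weight` parameter controls how much the prosodic score counts relative to the textual one. Comparing how that balance would need to shift between genres is one simple way to think about the feature-usage differences the abstract describes.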