Conference: International Conference on Audio, Language and Image Processing

Integrating Acoustic and Lexical Features in Topic Segmentation of Chinese Broadcast News Using Maximum Entropy Approach


Abstract

This paper studies how to integrate multi-modal features in automatic topic segmentation of Mandarin broadcast news. The multi-modal feature integration problem is formulated within the Maximum Entropy (MaxEnt) scheme for topic boundary classification, which maximizes entropy subject to all known constraints (i.e., the contributions of multiple features). We consider two types of features in particular: (1) acoustic features, which reflect the editorial prosody of broadcast news, including pause duration, speaker change and speech type; and (2) lexical features extracted from speech recognition transcripts, which capture the semantic shifts between topics, including two local cohesiveness features and a new boundary indicator based on overall cohesiveness. Compared to the local lexical features, the new overall cohesiveness feature maximizes the lexical cohesiveness of all topic fragments and reflects the fact that topic transitions in broadcast news are smooth and the distributional variations are subtle. Experiments show a clear performance improvement in topic segmentation of Chinese broadcast news when acoustic and lexical features are fused within the MaxEnt scheme.
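
The fusion idea in the abstract can be illustrated with a minimal sketch. For a two-class decision (boundary or no boundary), a conditional MaxEnt model has the same form as logistic regression, so the sketch below trains one by gradient ascent on a combined acoustic and lexical feature vector. The feature layout (pause duration, speaker change flag, lexical similarity across the candidate boundary), the toy values, and all function names are assumptions made for illustration; they are not taken from the paper, and the paper's actual features and training procedure may differ.

```python
import math

# Hypothetical toy data, invented for illustration only.
# Each row: [pause duration in seconds, speaker change (0/1),
#            lexical similarity across the candidate boundary]
TOY_FEATURES = [
    [2.0, 1, 0.10], [1.5, 1, 0.20], [1.8, 0, 0.15], [2.2, 1, 0.05],  # boundaries
    [0.2, 0, 0.80], [0.3, 0, 0.70], [0.4, 1, 0.90], [0.1, 0, 0.60],  # non-boundaries
]
TOY_LABELS = [1, 1, 1, 1, 0, 0, 0, 0]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def boundary_probability(x, weights, bias):
    """P(boundary | x) under the binary MaxEnt (logistic) model."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


def train_maxent(samples, labels, epochs=500, lr=0.5, l2=0.01):
    """Fit weights by gradient ascent on the mean conditional log-likelihood.

    The gradient for each weight is the observed feature value minus the
    value expected under the model, which is the MaxEnt constraint that
    model expectations should match empirical ones (softened here by a
    small L2 penalty on the weights).
    """
    n = len(samples)
    n_features = len(samples[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(samples, labels):
            err = y - boundary_probability(x, weights, bias)
            for i, xi in enumerate(x):
                grad_w[i] += err * xi
            grad_b += err
        for i in range(n_features):
            weights[i] += lr * (grad_w[i] / n - l2 * weights[i])
        bias += lr * grad_b / n
    return weights, bias


if __name__ == "__main__":
    w, b = train_maxent(TOY_FEATURES, TOY_LABELS)
    # A long pause with a speaker change and low lexical similarity
    print(boundary_probability([2.0, 1, 0.10], w, b))
    # A short pause with high lexical similarity
    print(boundary_probability([0.2, 0, 0.80], w, b))
```

Because every cue enters the model as one weighted feature, adding a further acoustic or lexical indicator only means extending the feature vector; the learned weights then show how much each cue contributes to the boundary decision.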
