Conference: 2010 IEEE Spoken Language Technology Workshop
Automatic key term extraction from spoken course lectures using branching entropy and prosodic/semantic features

Abstract

This paper proposes a set of approaches for automatically extracting key terms from spoken course lectures, drawing on audio signals, ASR transcriptions, and slides. We divide key terms into two types, key phrases and keywords, and develop a different approach for each, extracting them in that order. We extract key phrases using right/left branching entropy, and extract keywords by learning from three sets of features: prosodic features, lexical features, and semantic features from Probabilistic Latent Semantic Analysis (PLSA). The learning approaches include an unsupervised method (K-means exemplar) and two supervised ones (AdaBoost and neural network). Preliminary results on a corpus of course lectures are very encouraging, and all of the proposed approaches and feature sets were found to be useful.
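The abstract names right/left branching entropy but does not give a formula. As a rough illustration only, the sketch below uses the standard definition: for a word sequence X, the right branching entropy is the entropy of the distribution of words that immediately follow X, and the left branching entropy is the same for the words that precede it. The function name `branching_entropy`, its parameters, and the toy sentence are assumptions made for this example and are not taken from the paper.

```python
import math
from collections import Counter, defaultdict


def branching_entropy(tokens, n, direction="right"):
    """Return {n-gram: entropy in bits of its adjacent-word distribution}.

    Hypothetical helper for illustration; not the authors' implementation.
    direction="right" looks at the word after each n-gram,
    direction="left" looks at the word before it.
    """
    neighbours = defaultdict(Counter)
    for i in range(len(tokens) - n + 1):
        gram = tuple(tokens[i:i + n])
        j = i + n if direction == "right" else i - 1
        if 0 <= j < len(tokens):
            neighbours[gram][tokens[j]] += 1

    entropy = {}
    for gram, counts in neighbours.items():
        total = sum(counts.values())
        entropy[gram] = -sum(
            (c / total) * math.log2(c / total) for c in counts.values()
        )
    return entropy


# Toy transcript (made up for this example).
tokens = (
    "hidden markov model is a hidden markov chain "
    "and hidden markov process"
).split()

right_1 = branching_entropy(tokens, 1, "right")
right_2 = branching_entropy(tokens, 2, "right")

# "hidden" is always followed by "markov": no uncertainty inside the phrase.
print(right_1[("hidden",)])
# "hidden markov" is followed by three different words: high uncertainty,
# which suggests a phrase boundary after "markov".
print(right_2[("hidden", "markov")])
```

In this toy input the entropy is zero after "hidden" and rises after "hidden markov", which is the kind of contrast a boundary detector could use to treat "hidden markov" as a candidate key phrase. How the paper thresholds or combines the left and right values is not stated in the abstract.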