Published in: IEEE Transactions on Audio, Speech, and Language Processing

Story Segmentation and Topic Classification of Broadcast News via a Topic-Based Segmental Model and a Genetic Algorithm


Abstract

This paper presents a two-stage approach to story segmentation and topic classification of broadcast news. The two-stage paradigm adopts a decision tree and a maximum entropy model to identify potential story boundaries in the broadcast news within a sliding window. The story segmentation problem is thus reduced to determining a boundary position sequence from the potential boundary regions. A genetic algorithm is then applied to determine the chromosome that corresponds to the final boundary position sequence. A topic-based segmental model is proposed to define the fitness function used in the genetic algorithm. Syllable- and word-based story segmentation schemes are adopted to evaluate the proposed approach. Experimental results indicate that a miss probability of 0.1587 and a false alarm probability of 0.0859 are achieved for story segmentation on the collected broadcast news corpus. On the TDT-3 Mandarin audio corpus, a miss probability of 0.1232 and a false alarm probability of 0.1298 are achieved. Moreover, an outside classification accuracy of 74.55% is obtained for topic classification on the collected broadcast news, while an inside classification accuracy of 88.82% is achieved on the TDT-2 Mandarin audio corpus.
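To make the idea of searching for a boundary position sequence more concrete, the following is a minimal sketch of a genetic algorithm that picks one boundary per candidate region. It is not the paper's implementation: the abstract does not specify the chromosome encoding, the genetic operators, or the form of the topic-based segmental model. The per-unit topic labels, the `fitness` function (a simple majority-label count per segment), the truncation selection, the uniform crossover, and all parameter values are assumptions chosen purely for illustration.

```python
import random
from collections import Counter

def fitness(boundaries, labels):
    """Toy stand-in for a segmental score (an assumption, not the paper's model):
    sum, over segments, of the count of each segment's most frequent label."""
    cuts = [0] + sorted(boundaries) + [len(labels)]
    score = 0
    for start, end in zip(cuts, cuts[1:]):
        if end > start:
            score += Counter(labels[start:end]).most_common(1)[0][1]
    return score

def evolve(regions, labels, pop_size=30, generations=40,
           mutation_rate=0.2, seed=0):
    """Search for one boundary position inside each candidate region.
    A chromosome is a list with one position per (lo, hi) region, inclusive."""
    rng = random.Random(seed)

    def random_chromosome():
        return [rng.randint(lo, hi) for lo, hi in regions]

    population = [random_chromosome() for _ in range(pop_size)]
    for _ in range(generations):
        # Truncation selection: the fitter half survives unchanged (elitism).
        population.sort(key=lambda c: fitness(c, labels), reverse=True)
        parents = population[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Uniform crossover: each gene is taken from either parent.
            child = [rng.choice(pair) for pair in zip(a, b)]
            # Mutation: resample a gene within its own region.
            for i, (lo, hi) in enumerate(regions):
                if rng.random() < mutation_rate:
                    child[i] = rng.randint(lo, hi)
            children.append(child)
        population = parents + children
    return max(population, key=lambda c: fitness(c, labels))

# Hypothetical input: 15 units with made-up topic labels and two
# candidate boundary regions, as a first-stage detector might produce.
labels = ["A"] * 5 + ["B"] * 4 + ["C"] * 6
regions = [(3, 7), (8, 11)]
best = evolve(regions, labels)
print(best, fitness(best, labels))
```

Because every gene is drawn from its own region, any chromosome the search returns is a valid boundary position sequence. In a real system the fitness would presumably come from the topic-based segmental model rather than from known labels.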
