Journal: IEEE Transactions on Audio, Speech, and Language Processing

Semantic Analysis and Organization of Spoken Documents Based on Parameters Derived From Latent Topics

Abstract

Spoken documents are audio signals, so they cannot easily be displayed on-screen or scanned and browsed by the user. It is therefore highly desirable to automatically construct summaries, titles, latent topic trees, and key-term-based topic labels for these documents to aid the user in browsing. We refer to this as semantic analysis and organization. Because network content is both copious and dynamic, with topics and domains changing every day, the approaches here must be primarily unsupervised. We propose a framework for unsupervised semantic analysis and organization of spoken documents, and for this purpose we propose two measures derived from latent topic analysis: latent topic significance and latent topic entropy. We show that these can be integrated into an application system with which the user can more easily navigate archives of spoken documents. Probabilistic latent semantic analysis (PLSA) is used as a typical example of unsupervised topic analysis in most experiments, although latent Dirichlet allocation (LDA) is also used in some experiments to show that the proposed measures are equally applicable to different analysis approaches. All of the experiments were performed on Mandarin Chinese broadcast news.
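
The abstract names latent topic entropy but does not define it. As a rough illustration only, the sketch below assumes it is the Shannon entropy of a term's posterior distribution over latent topics, P(topic | term), as would be produced by PLSA or LDA. The function name `latent_topic_entropy` and the example probabilities are hypothetical and are not taken from the paper.

```python
import math


def latent_topic_entropy(topic_probs):
    """Shannon entropy (in nats) of a term's distribution over latent topics.

    Assumption: `topic_probs` holds non-negative weights proportional to
    P(topic | term); they are normalized here before the entropy is computed.
    """
    total = sum(topic_probs)
    if total <= 0:
        raise ValueError("topic_probs must contain positive mass")
    entropy = 0.0
    for p in topic_probs:
        q = p / total
        if q > 0:  # terms with zero probability contribute nothing
            entropy -= q * math.log(q)
    return entropy


# Hypothetical distributions over four latent topics.
focused = latent_topic_entropy([0.97, 0.01, 0.01, 0.01])  # topic-specific term
spread = latent_topic_entropy([0.25, 0.25, 0.25, 0.25])   # topic-neutral term
```

Under this reading, a term concentrated on one topic gets a low entropy and a term spread evenly across topics gets the maximum value, log K for K topics, so low-entropy terms would be the more natural candidates for key terms and topic labels.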