International Symposium on Chinese Spoken Language Processing

A Multi-layered Summarization System for Multi-media Archives by Understanding and Structuring of Chinese Spoken Documents

Abstract

Multi-media archives are difficult to display on a screen, and difficult to retrieve and browse. It is therefore important to develop technologies that summarize entire archives in the network content to help the user in browsing and retrieval. In a recent paper [1] we proposed a complete set of multi-layered technologies to handle at least some of the above issues: (1) Automatic Generation of Titles and Summaries for each of the spoken documents, so that the spoken documents become much easier to browse; (2) Global Semantic Structuring of the entire spoken document archive, offering the user a global picture of the semantic structure of the archive; and (3) Query-based Local Semantic Structuring for the subset of the spoken documents retrieved by the user's query, providing the user with the detailed semantic structure of the spoken documents relevant to the query entered. Probabilistic Latent Semantic Analysis (PLSA) is found to be helpful. This paper presents an initial prototype system for Chinese archives with the functions mentioned above, in which a broadcast news archive in Mandarin Chinese is taken as the example archive.