
Story co-segmentation of Chinese broadcast news using weakly-supervised semantic similarity



Abstract

This paper presents lexical story co-segmentation, a new approach to automatically extracting stories on the same topic from multiple Chinese broadcast news documents. Unlike topic tracking and detection, our approach does not need the guidance of well-trained topic models and can consistently segment the common stories from the input documents. Following the Markov random field (MRF) scheme, we construct a Gibbs energy function that balances the intra-document and inter-document lexical semantic dependencies, and we solve story co-segmentation as a binary labeling problem at the sentence level. Because measuring lexical semantic similarity is central to story co-segmentation, we propose a weakly-supervised correlated affinity graph (WSCAG) model to derive the latent semantic similarities between Chinese words from the target corpus. Based on this model, we extend the classical cosine similarity by mapping the observed word distribution into the latent semantic space, which leads to a generalized lexical cosine similarity measure. Extensive experiments on a benchmark dataset validate the effectiveness of our story co-segmentation approach. In addition, we demonstrate that the proposed WSCAG semantic similarity measure outperforms other state-of-the-art semantic measures in story co-segmentation. (C) 2019 Published by Elsevier B.V.
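As a rough illustration of the generalized cosine similarity idea, the sketch below compares two sentence-level word-count vectors through a word-to-word similarity matrix. The abstract does not give the paper's exact formula, so the bilinear form used here, the function name `generalized_cosine`, and the toy matrix `S_semantic` are all assumptions; in the paper the word similarities would come from the WSCAG model rather than being set by hand.

```python
import math

def generalized_cosine(a, b, S):
    """Cosine similarity of count vectors a and b under a word-similarity
    matrix S. S is a hypothetical stand-in for learned word similarities;
    with S equal to the identity this is the classical cosine similarity."""
    def bilinear(x, y):
        # x^T S y, written out without external libraries
        return sum(x[i] * S[i][j] * y[j]
                   for i in range(len(x)) for j in range(len(y)))
    denom = math.sqrt(bilinear(a, a) * bilinear(b, b))
    return bilinear(a, b) / denom if denom > 0 else 0.0

# Toy vocabulary of three words; words 0 and 1 are assumed near-synonyms.
S_identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
S_semantic = [[1.0, 0.8, 0.0], [0.8, 1.0, 0.0], [0.0, 0.0, 1.0]]

sent_a = [1, 0, 0]  # sentence containing only word 0
sent_b = [0, 1, 0]  # sentence containing only word 1

classical = generalized_cosine(sent_a, sent_b, S_identity)
generalized = generalized_cosine(sent_a, sent_b, S_semantic)
print(classical, generalized)
```

With the identity matrix the two sentences share no words and score zero, whereas the assumed semantic matrix lets the related words contribute, which is the kind of effect the generalized measure is meant to capture.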

Bibliographic record

  • Source
    Neurocomputing | 2019, No. 25 | pp. 121-133 | 13 pages
  • Author affiliations

    Tianjin Univ Coll Intelligence & Comp Sch Comp Sci & Technol Tianjin 300350 Peoples R China|State Adm Cultural Heritage Key Res Ctr Surface Monitoring & Anal Cultural Re Beijing Peoples R China;

    Natl Univ Singapore Dept Elect & Comp Engn Singapore Singapore;

    Tianjin Univ Coll Intelligence & Comp Sch Comp Sci & Technol Tianjin 300350 Peoples R China|State Adm Cultural Heritage Key Res Ctr Surface Monitoring & Anal Cultural Re Beijing Peoples R China;

    City Univ Hong Kong Sch Creat Media Hong Kong Peoples R China;

    Tianjin Univ Coll Intelligence & Comp Sch Comp Sci & Technol Tianjin 300350 Peoples R China|JAIST Sch Informat Sci Nomi Japan;

  • Indexed in: Science Citation Index (SCI), USA; Engineering Index (EI), USA
  • Original format: PDF
  • Text language: English
  • Chinese Library Classification
  • Keywords

    Story co-segmentation; Weakly-supervised correlated affinity graph (WSCAG); Parallel affinity propagation; Generalized cosine similarity; Chinese broadcast news; MRF;

