
Story co-segmentation of Chinese broadcast news using weakly-supervised semantic similarity



Abstract

This paper presents lexical story co-segmentation, a new approach to automatically extracting stories on the same topic from multiple Chinese broadcast news documents. Unlike topic tracking and detection, our approach does not need the guidance of well-trained topic models and can consistently segment the common stories from the input documents. Following the Markov random field (MRF) scheme, we construct a Gibbs energy function that balances the intra-document and inter-document lexical semantic dependencies, and we solve story co-segmentation as a binary labeling problem at the sentence level. Because measuring lexical semantic similarity is central to story co-segmentation, we propose a weakly-supervised correlated affinity graph (WSCAG) model to derive the latent semantic similarities between Chinese words from the target corpus. Based on this, we extend the classical cosine similarity by mapping the observed word distribution into the latent semantic space, which leads to a generalized lexical cosine similarity measure. Extensive experiments on a benchmark dataset validate the effectiveness of our story co-segmentation approach. In addition, we demonstrate that the proposed WSCAG semantic similarity measure outperforms other state-of-the-art semantic measures in story co-segmentation. (C) 2019 Published by Elsevier B.V.
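
The abstract does not state the formula for the generalized cosine similarity. As a rough illustration only, the sketch below uses the common "soft cosine" form, sim(a, b) = aᵀSb / sqrt((aᵀSa)(bᵀSb)), where S is a word-to-word similarity matrix. This form, the function name `generalized_cosine`, the three-word vocabulary, and the values in `S` are all assumptions made for the example. In the paper, the word similarities would come from the WSCAG model, which is not reproduced here.

```python
import math


def generalized_cosine(a, b, S):
    """Soft cosine of two bag-of-words vectors under a word similarity matrix S.

    Assumed form (not taken from the paper): a^T S b / sqrt((a^T S a)(b^T S b)).
    With S equal to the identity matrix this reduces to classical cosine similarity.
    """
    def bilinear(x, y):
        # Computes x^T S y with plain nested sums.
        return sum(x[i] * S[i][j] * y[j]
                   for i in range(len(x))
                   for j in range(len(y)))

    denom = math.sqrt(bilinear(a, a) * bilinear(b, b))
    return bilinear(a, b) / denom if denom > 0 else 0.0


# Hypothetical 3-word vocabulary; words 0 and 1 are assumed to be near-synonyms.
S = [[1.0, 0.8, 0.0],
     [0.8, 1.0, 0.0],
     [0.0, 0.0, 1.0]]
IDENTITY = [[1.0, 0.0, 0.0],
            [0.0, 1.0, 0.0],
            [0.0, 0.0, 1.0]]

sent_a = [1, 0, 0]  # sentence containing only word 0
sent_b = [0, 1, 0]  # sentence containing only word 1

classical = generalized_cosine(sent_a, sent_b, IDENTITY)
generalized = generalized_cosine(sent_a, sent_b, S)
print(classical, generalized)
```

With the identity matrix the two sentences share no words and score zero. With the assumed similarity matrix they score higher because their words are treated as semantically related, which is the effect the abstract attributes to mapping word distributions into a latent semantic space.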

Bibliographic record

  • Source
    Neurocomputing | 2019, No. 25 | pp. 121-133 | 13 pages
  • Author affiliations

    Tianjin Univ, Coll Intelligence & Comp, Sch Comp Sci & Technol, Tianjin 300350, Peoples R China|State Adm Cultural Heritage, Key Res Ctr Surface Monitoring & Anal Cultural Re, Beijing, Peoples R China;

    Natl Univ Singapore, Dept Elect & Comp Engn, Singapore, Singapore;

    Tianjin Univ, Coll Intelligence & Comp, Sch Comp Sci & Technol, Tianjin 300350, Peoples R China|State Adm Cultural Heritage, Key Res Ctr Surface Monitoring & Anal Cultural Re, Beijing, Peoples R China;

    City Univ Hong Kong, Sch Creat Media, Hong Kong, Peoples R China;

    Tianjin Univ, Coll Intelligence & Comp, Sch Comp Sci & Technol, Tianjin 300350, Peoples R China|JAIST, Sch Informat Sci, Nomi, Japan;

  • Indexed in: Science Citation Index (SCI), USA; Engineering Index (EI), USA;
  • Original format: PDF
  • Text language: English
  • Chinese Library Classification
  • Keywords

    Story co-segmentation; Weakly-supervised correlated affinity graph (WSCAG); Parallel affinity propagation; Generalized cosine similarity; Chinese broadcast news; MRF;

  • Date added to database: 2022-08-18 04:20:36
