Conference: International Joint Conference on Natural Language Processing

Combining Labeled and Unlabeled Data for Learning Cross-document Structural Relationships

Abstract

Multi-document discourse analysis has emerged with the potential to improve various NLP applications. Based on the newly proposed Cross-document Structure Theory (CST), this paper describes an empirical study that classifies CST relationships between sentence pairs extracted from topically related documents, exploiting both labeled and unlabeled data. We investigate a binary classifier for determining the existence of structural relationships and a full classifier using the full taxonomy of relationships. We show that in both cases the exploitation of unlabeled data helps improve the performance of the learned classifiers.
