
Adaptive Ensembling: Unsupervised Domain Adaptation for Political Document Analysis

Abstract

Insightful findings in political science often require researchers to analyze documents of a certain subject or type, yet these documents are usually contained in large corpora that do not distinguish between pertinent and non-pertinent documents. In contrast, we can find corpora that label relevant documents but have limitations (e.g., from a single source or era), preventing their use for political science research. To bridge this gap, we present adaptive ensembling, an unsupervised domain adaptation framework, equipped with a novel text classification model and time-aware training to ensure our methods work well with diachronic corpora. Experiments on an expert-annotated dataset show that our framework outperforms strong benchmarks. Further analysis indicates that our methods are more stable, learn better representations, and extract cleaner corpora for fine-grained analysis.
