An Ensemble Approach to Cross-Domain Authorship Attribution


Abstract

This paper presents an ensemble approach to cross-domain authorship attribution that combines the predictions of three independent classifiers, built respectively on standard character n-grams, character n-grams with non-diacritic distortion, and word n-grams. Our proposal relies on variable-length n-gram models and multinomial logistic regression, and it outputs the highest-probability prediction among the three models. We compare the approach against a number of baseline systems and report results on both the PAN-CLEF 2018 test data and a new corpus of song lyrics in English and Portuguese.
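The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. All function names and n-gram ranges are hypothetical. The distortion rule (masking unaccented letters and digits with `*` while keeping accented characters, punctuation, and whitespace) is an assumption about what "non-diacritic distortion" means. The per-model probabilities are assumed to come from separately trained multinomial logistic regression classifiers, which are not shown here.

```python
import unicodedata
from collections import Counter


def char_ngrams(text, n_min=2, n_max=4):
    """Count character n-grams of every length from n_min to n_max (variable-length model)."""
    counts = Counter()
    for n in range(n_min, n_max + 1):
        for i in range(len(text) - n + 1):
            counts[text[i:i + n]] += 1
    return counts


def word_ngrams(text, n_min=1, n_max=2):
    """Count word n-grams over lowercased, whitespace-separated tokens."""
    tokens = text.lower().split()
    counts = Counter()
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


def distort(text):
    """Assumed distortion rule: replace unaccented letters and digits with '*'.

    Accented characters, punctuation, and whitespace are kept unchanged.
    """
    text = unicodedata.normalize("NFC", text)
    out = []
    for ch in text:
        # A character is "plain" if it is alphanumeric and has no decomposition
        # into a base character plus combining marks.
        is_plain = ch.isalnum() and unicodedata.normalize("NFD", ch) == ch
        out.append("*" if is_plain else ch)
    return "".join(out)


def select_prediction(model_probs):
    """Pick the single most confident prediction across all models.

    model_probs maps a model name to a dict of author -> probability
    (assumed to be the output of that model's classifier).
    Returns (author, model_name, probability).
    """
    prob, author, model = max(
        ((p, a, m) for m, probs in model_probs.items() for a, p in probs.items()),
        key=lambda item: item[0],
    )
    return author, model, prob


if __name__ == "__main__":
    print(distort("Não sei, coração!"))
    # Hypothetical probabilities for one test document and two candidate authors.
    probs = {
        "char": {"author_a": 0.55, "author_b": 0.45},
        "distorted_char": {"author_a": 0.30, "author_b": 0.70},
        "word": {"author_a": 0.60, "author_b": 0.40},
    }
    print(select_prediction(probs))
```

In this sketch the distorted-character model would be fed `char_ngrams(distort(text))`, so it sees only diacritics, punctuation, and spacing patterns. The selection step takes the single most confident prediction across the three models and does not average their probabilities, matching the rule stated in the abstract.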
