
An Off-the-shelf Approach to Authorship Attribution


Abstract

Authorship detection is a challenging task because of the many design choices the user has to make. Performance depends heavily on the right set of features, the amount of data, in-sample vs. out-of-sample settings, and profile- vs. instance-based approaches. So far, the variety of combinations has made off-the-shelf methods for authorship detection impractical. We propose a novel and generally deployable method that does not share these limitations. We treat authorship attribution as an anomaly detection problem in which author regions are learned in feature space. The right feature space for a given task is identified automatically by representing the optimal solution as a linear mixture of multiple kernel functions (multiple kernel learning, MKL). Our approach allows labelled as well as unlabelled examples to be included to remedy the in-sample and out-of-sample problems. Empirically, we observe that the proposed technique is either better than or on par with baseline competitors. Moreover, our method relieves the user of critical design choices (e.g., the feature set) and can therefore be used as an off-the-shelf method for authorship attribution.
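
The idea of scoring a text against an author region under a linear mixture of kernels can be illustrated with a minimal sketch. It is not the method of the paper: the kernel weights are fixed by hand here, whereas the abstract states that the mixture is identified automatically, and the mean similarity to an author's known texts is only a simple stand-in for a learned author region. The character n-gram kernels and all names (`char_ngrams`, `cosine_kernel`, `mixed_kernel`, `author_score`) and sample texts are assumptions made for illustration.

```python
import math
from collections import Counter


def char_ngrams(text, n):
    """Count the character n-grams of a text."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine_kernel(a, b):
    """Normalised linear kernel between two sparse count vectors."""
    dot = sum(v * b.get(k, 0) for k, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mixed_kernel(x, y, weights):
    """Linear mixture sum_m beta_m * k_m(x, y), one base kernel per n-gram order.

    `weights` maps n-gram order -> beta_m. The values are hand-picked
    assumptions; an MKL procedure would learn them from data.
    """
    return sum(
        beta * cosine_kernel(char_ngrams(x, n), char_ngrams(y, n))
        for n, beta in weights.items()
    )


def author_score(text, author_texts, weights):
    """Mean mixed-kernel similarity of `text` to an author's known texts.

    A low score marks the text as an anomaly with respect to that author.
    """
    return sum(mixed_kernel(text, t, weights) for t in author_texts) / len(author_texts)


# Hypothetical toy data: two known texts by one author, one similar
# candidate and one clearly different candidate.
weights = {1: 0.2, 2: 0.3, 3: 0.5}
known = ["the cat sat on the mat", "the cat ate the rat"]
similar = "the rat sat on the cat"
different = "zzz qqq xxx 123"

print(author_score(similar, known, weights) > author_score(different, known, weights))
```

Because each base kernel captures a different feature space (here, different n-gram orders), changing the weights changes which features dominate the decision. Learning those weights is what would remove the manual feature-set choice described in the abstract.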
