
N-GRAM-BASED AUTHOR PROFILES FOR AUTHORSHIP ATTRIBUTION


Abstract

We present a novel method for computer-assisted authorship attribution based on character-level n-gram author profiles, motivated by an almost-forgotten, pioneering method from 1976. Existing approaches to automated authorship attribution implicitly build author profiles as vectors of feature weights, as language models, or in similar forms. Our approach is based on byte-level n-grams, is language independent, and generates author profiles of limited size. The effectiveness and language independence of the approach are demonstrated in experiments on English, Greek, and Chinese data. The accuracy is on par with current state-of-the-art approaches, and higher in some cases.
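
To make the idea of a size-limited, byte-level n-gram profile concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the n-gram length, the profile size, or the measure used to compare profiles, so the names `build_profile`, `dissimilarity`, and `attribute`, the defaults `n=3` and `size=500`, and the relative-difference measure are all chosen here for the example.

```python
from __future__ import annotations

from collections import Counter


def build_profile(text: str, n: int = 3, size: int = 500) -> dict[bytes, float]:
    """Return the `size` most frequent byte n-grams with normalized frequencies.

    The values of `n` and `size` are illustrative assumptions.
    """
    # Working on UTF-8 bytes avoids any language-specific tokenization.
    data = text.encode("utf-8")
    counts = Counter(data[i:i + n] for i in range(len(data) - n + 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    # Keeping only the top `size` n-grams bounds the size of the profile.
    return {gram: c / total for gram, c in counts.most_common(size)}


def dissimilarity(p1: dict[bytes, float], p2: dict[bytes, float]) -> float:
    """Sum of squared relative frequency differences (an assumed measure)."""
    total = 0.0
    for gram in p1.keys() | p2.keys():
        f1 = p1.get(gram, 0.0)
        f2 = p2.get(gram, 0.0)
        # f1 + f2 > 0 because the n-gram occurs in at least one profile.
        total += ((f1 - f2) / ((f1 + f2) / 2)) ** 2
    return total


def attribute(text: str, author_profiles: dict[str, dict[bytes, float]],
              n: int = 3, size: int = 500) -> str:
    """Return the author whose profile is least dissimilar to the text's."""
    profile = build_profile(text, n, size)
    return min(author_profiles,
               key=lambda author: dissimilarity(profile, author_profiles[author]))
```

In this sketch, one profile per candidate author would be built from that author's known texts, and a disputed text is assigned to the author with the smallest dissimilarity. The assumed measure adds a fixed penalty for every n-gram missing from one of the two profiles, so all profiles should be built with the same `n` and `size`.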
