Journal of the American Society for Information Science and Technology

Estimating the Probability of an Authorship Attribution

Abstract

In authorship attribution, various distance-based metrics have been proposed to determine the most probable author of a disputed text. In this paradigm, a distance is computed between each author profile and the query text. These values are then employed only to rank the possible authors. In this article, we analyze their distribution and show that we can model it as a mixture of 2 Beta distributions. Based on this finding, we demonstrate how we can derive a more accurate probability that the closest author is, in fact, the real author. To evaluate this approach, we have chosen 4 authorship attribution methods (Burrows' Delta, Kullback-Leibler divergence, Labbé's intertextual distance, and naive Bayes). As the first test collection, we have downloaded 224 State of the Union addresses (from 1790 to 2014) delivered by 41 U.S. presidents. The second test collection is formed by the Federalist Papers. The evaluations indicate that the accuracy rate of some authorship decisions can be improved. The suggested method can signal that the proposed assignment should be interpreted as possible, without strong certainty. Being able to quantify the certainty associated with an authorship decision can be a useful component when important decisions must be taken.
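
As a rough illustration of the idea described in the abstract, the sketch below shows how a two-component Beta mixture can turn a score into a probability that the closest author is the real one. It is a minimal sketch under stated assumptions, not the article's procedure: it assumes each attribution yields a score in the open interval (0, 1), that past attributions with known outcomes are available to fit one Beta component for correct decisions and one for incorrect decisions, and that a simple method-of-moments fit is acceptable. The function names, the fitting method, and all numbers are hypothetical.

```python
import math


def beta_pdf(x, a, b):
    """Density of a Beta(a, b) distribution at x, for 0 < x < 1."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x))


def fit_beta_moments(values):
    """Method-of-moments estimate of (a, b) from scores in (0, 1).

    Assumption: the sample variance is smaller than mean * (1 - mean),
    otherwise the estimates are not valid Beta parameters.
    """
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    common = mean * (1 - mean) / var - 1
    return mean * common, (1 - mean) * common


def prob_correct(x, correct_params, wrong_params, prior_correct):
    """Posterior probability that the top-ranked author is the true author.

    Applies Bayes' rule to a mixture of two Beta components:
    one for correct assignments, one for incorrect assignments.
    """
    p_correct = prior_correct * beta_pdf(x, *correct_params)
    p_wrong = (1 - prior_correct) * beta_pdf(x, *wrong_params)
    return p_correct / (p_correct + p_wrong)


# Made-up training scores (hypothetical, for illustration only):
# small scores when the closest author was right, larger when wrong.
scores_when_correct = [0.10, 0.15, 0.22, 0.18, 0.30, 0.12]
scores_when_wrong = [0.55, 0.70, 0.62, 0.48, 0.80, 0.66]

correct_params = fit_beta_moments(scores_when_correct)
wrong_params = fit_beta_moments(scores_when_wrong)
prior = len(scores_when_correct) / (len(scores_when_correct) + len(scores_when_wrong))

# Probability attached to a new decision whose score is 0.35.
print(prob_correct(0.35, correct_params, wrong_params, prior))
```

A probability near 0.5 from such a model would correspond to the case the abstract mentions, where the proposed assignment should be read as possible but without strong certainty.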