
A New Approach for Authorship Attribution



Abstract

Authorship attribution is a text classification technique used to identify the author of an unknown document by analyzing documents written by multiple candidate authors. The accuracy of author identification depends mainly on the writing styles of the authors, so selecting features that differentiate those styles is one of the most important steps in authorship attribution. Researchers have proposed character, word, syntactic, semantic, structural, and readability features to predict the author of an unknown document. Few researchers have used term weight measures in authorship attribution, although such measures have proven effective at improving the accuracy of text classification. Existing approaches to authorship attribution use the bag-of-words approach to represent document vectors. In this work, a new approach is proposed in which the document weight is used to represent the document vector instead of the features or terms in the document. Experiments are carried out on a reviews corpus with various classifiers, and the author attribution results are better than those of most existing approaches.
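
The abstract does not say how the document weight is computed, so the following is only an illustrative sketch under stated assumptions, not the authors' method. It assumes that a term's weight for an author is the share of that term's training occurrences written by that author, and that a document's weight for an author is the sum of those term weights over the document's tokens. The document vector then has one dimension per candidate author rather than one per vocabulary term, which is the contrast with bag-of-words the abstract draws. The function names `term_weights` and `document_vector` and the toy reviews are hypothetical.

```python
from collections import Counter, defaultdict


def term_weights(train_docs):
    """Return {author: {term: weight}} from (author, text) pairs.

    Assumed weighting (not from the paper): the fraction of a term's
    total training occurrences that belong to the given author.
    """
    per_author = defaultdict(Counter)
    total = Counter()
    for author, text in train_docs:
        tokens = text.lower().split()
        per_author[author].update(tokens)
        total.update(tokens)
    return {
        author: {term: count / total[term] for term, count in counts.items()}
        for author, counts in per_author.items()
    }


def document_vector(text, weights, authors):
    """Represent a document by one weight per candidate author.

    Assumed document weight: the sum of the author-specific term
    weights over the document's tokens; unseen terms contribute 0.
    """
    tokens = text.lower().split()
    return [sum(weights[a].get(t, 0.0) for t in tokens) for a in authors]


# Toy reviews corpus (made up for illustration).
train_docs = [
    ("alice", "the battery life is superb and the screen is superb"),
    ("alice", "superb sound and a superb screen"),
    ("bob", "terrible battery and terrible support"),
    ("bob", "the support was terrible"),
]
authors = sorted({author for author, _ in train_docs})
weights = term_weights(train_docs)

vec = document_vector("superb screen but terrible battery", weights, authors)
print(authors, vec)
```

A vector like `vec` would be the input to whichever classifier is used. Its length equals the number of candidate authors, so it stays small no matter how large the vocabulary grows.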
