Conference: International Conference on Natural Language Processing

Authorship Attribution in Bengali Language

Abstract

We describe authorship attribution of Bengali literary text. Our contributions include a new corpus of 3,000 passages written by three Bengali authors, an end-to-end system for authorship classification based on character n-grams, feature selection for authorship attribution, feature ranking and analysis, and a learning curve to assess the relationship between the amount of training data and test accuracy. We achieve state-of-the-art results on a held-out dataset, indicating that lexical n-gram features are unarguably the best discriminators for authorship attribution of Bengali literary text.
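
The idea of classifying authors by character n-grams can be illustrated with a minimal sketch. This is not the authors' system: the abstract does not specify the classifier, the n-gram length, or the feature selection method, so the cosine-similarity profile matching, the default n=3, and the function names (char_ngrams, cosine, train, predict) are all assumptions made for illustration. The toy English passages are placeholders and are not drawn from the paper's corpus.

```python
from collections import Counter
import math


def char_ngrams(text, n=3):
    """Count overlapping character n-grams in a string."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine(a, b):
    """Cosine similarity between two n-gram count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items() if gram in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def train(passages_by_author, n=3):
    """Build one aggregated n-gram profile per author (assumed approach)."""
    profiles = {}
    for author, passages in passages_by_author.items():
        profile = Counter()
        for passage in passages:
            profile.update(char_ngrams(passage, n))
        profiles[author] = profile
    return profiles


def predict(profiles, text, n=3):
    """Return the author whose profile is most similar to the text."""
    grams = char_ngrams(text, n)
    return max(profiles, key=lambda author: cosine(profiles[author], grams))


# Placeholder passages for illustration only; not data from the paper.
toy_corpus = {
    "author_a": ["the river runs quietly", "the river bends slowly"],
    "author_b": ["storms gather over hills", "storms break over fields"],
}
profiles = train(toy_corpus)
print(predict(profiles, "the river flows"))
```

Because the functions operate on Python strings (Unicode code points), the same sketch accepts Bengali text unchanged, although a real system would need to decide how to treat combining vowel signs and conjuncts when forming n-grams.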