Journal: Natural Language Engineering

Robust stylometric analysis and author attribution based on tones and rimes



Abstract

In this article, we propose an innovative and robust approach to stylometric analysis that requires no annotation and leverages both lexical and sub-lexical information. In particular, we leverage the phonological information of tones and rimes in Mandarin Chinese, extracted automatically from unannotated texts. Texts from different authors were represented by tones, tone motifs, and word-length motifs, as well as rimes and rime motifs. Support vector machines and random forests were used to build the text classification models for authorship attribution. From the experimental results, we conclude that the combination of rime bigrams, word-final rimes, and segment-final rimes discriminates effectively between texts by different authors when random forests are used to build the classification model. This robust approach can in principle be applied to other languages with an established phonological inventory of onsets and rimes.
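
The three rime features named in the abstract (rime bigrams, word-final rimes, segment-final rimes) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the text has already been word-segmented and converted to tone-numbered pinyin (for example `zhong1`), that a "segment" is a punctuation-delimited stretch of text, and that bigrams are counted within a segment. The names `ONSETS`, `split_syllable`, and `rime_features` are hypothetical, and the onset list is a simplification that treats the orthographic `y` and `w` as onsets.

```python
from collections import Counter

# Simplified list of pinyin initials (an assumption of this sketch).
# Two-letter initials come first so "zh" is matched before "z".
ONSETS = ("zh", "ch", "sh", "b", "p", "m", "f", "d", "t", "n", "l",
          "g", "k", "h", "j", "q", "x", "r", "z", "c", "s", "y", "w")


def split_syllable(syllable):
    """Split a tone-numbered pinyin syllable into (onset, rime, tone)."""
    # A syllable without a trailing digit is treated as neutral tone "5".
    tone = syllable[-1] if syllable[-1].isdigit() else "5"
    base = syllable.rstrip("012345")
    for onset in ONSETS:
        if base.startswith(onset):
            return onset, base[len(onset):], tone
    return "", base, tone  # zero-onset syllable such as "an4"


def rime_features(segments):
    """Count rime bigrams, word-final rimes, and segment-final rimes.

    segments: list of segments; each segment is a list of words;
    each word is a list of tone-numbered pinyin syllables.
    """
    feats = Counter()
    for segment in segments:
        rimes = []
        for word in segment:
            if not word:
                continue  # skip empty words defensively
            word_rimes = [split_syllable(s)[1] for s in word]
            rimes.extend(word_rimes)
            feats["word_final:" + word_rimes[-1]] += 1
        # Bigrams of adjacent rimes within the segment.
        for first, second in zip(rimes, rimes[1:]):
            feats["bigram:" + first + "_" + second] += 1
        if rimes:
            feats["segment_final:" + rimes[-1]] += 1
    return feats


# One segment containing two words: "zhong1 guo2" and "ren2".
example = rime_features([[["zhong1", "guo2"], ["ren2"]]])
```

Per-text counts of this kind could then be turned into feature vectors and passed to a classifier, for instance scikit-learn's `DictVectorizer` followed by `RandomForestClassifier`. That pairing is one possible choice and is not confirmed by the abstract.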
