Conference: IEEE International Conference on Big Data

Robustness of emotion extraction from 20th century English books


Abstract

In this paper, we test the robustness of emotion extraction from English-language books published in the 20th century. Our analysis is performed on a sample of the 8 million digitized books available in the Google Books Ngram corpus by applying three independent emotion detection tools: WordNet Affect, Linguistic Inquiry and Word Count, and a recently proposed ‘Hedonometer’ method. We also assess the statistical robustness of the extracted patterns as well as their outputs on specific parts of speech. The analysis confirms three main results: the existence of recognizable periods of positive and negative ‘literary affect’ from 1900 to 2000, a general decrease in the usage of emotion-related words in printed books that lasts at least until the 1980s, and, finally, a divergence between American and British books, with the former using more emotion-related words from the 1960s.
