Conference on Empirical Methods in Natural Language Processing

That's So Annoying!!!: A Lexical and Frame-Semantic Embedding Based Data Augmentation Approach to Automatic Categorization of Annoying Behaviors using #petpeeve Tweets



Abstract

We propose a novel data augmentation approach to enhance computational behavioral analysis using social media text. In particular, we collect a Twitter corpus of descriptions of annoying behaviors using the #petpeeve hashtags. In the qualitative analysis, we study language use in these tweets, with a special focus on the fine-grained categories and the geographic variation of the language. In the quantitative analysis, we show that lexical and syntactic features are useful for automatic categorization of annoying behaviors, and that frame-semantic features further boost performance; that leveraging large lexical embeddings to create additional training instances significantly improves the lexical model; and that incorporating frame-semantic embeddings achieves the best overall performance.
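
The abstract does not spell out how the lexical embeddings are used to create additional training instances. One common way to do this, shown here only as a minimal sketch and not as the authors' procedure, is to copy a labeled tweet and swap a single token for its nearest neighbor in embedding space. The vectors in `TOY_EMBEDDINGS`, the similarity threshold `min_sim`, the function names, and the label `"eating"` are all hypothetical and chosen purely for illustration.

```python
import math

# Made-up 3-dimensional word vectors for illustration only.
# A real setup would load large pretrained embeddings instead.
TOY_EMBEDDINGS = {
    "loud":   [0.90, 0.10, 0.00],
    "noisy":  [0.85, 0.15, 0.05],
    "people": [0.10, 0.90, 0.10],
    "folks":  [0.12, 0.88, 0.08],
    "chew":   [0.00, 0.20, 0.90],
    "munch":  [0.05, 0.25, 0.85],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def nearest_neighbor(word, embeddings, min_sim=0.9):
    """Return the most similar other word, or None if none reaches min_sim."""
    if word not in embeddings:
        return None
    best, best_sim = None, min_sim
    for other, vec in embeddings.items():
        if other == word:
            continue
        sim = cosine(embeddings[word], vec)
        if sim >= best_sim:
            best, best_sim = other, sim
    return best


def augment(tokens, label, embeddings, min_sim=0.9):
    """Create new (tokens, label) pairs by replacing one token at a time
    with its nearest embedding neighbor. The label is carried over unchanged."""
    new_instances = []
    for i, token in enumerate(tokens):
        neighbor = nearest_neighbor(token, embeddings, min_sim)
        if neighbor is not None:
            new_instances.append((tokens[:i] + [neighbor] + tokens[i + 1:], label))
    return new_instances


if __name__ == "__main__":
    # "eating" is a hypothetical category name, not one taken from the paper.
    for tokens, label in augment(["loud", "people", "chew"], "eating", TOY_EMBEDDINGS):
        print(label, " ".join(tokens))
```

Replacing one token at a time keeps each synthetic instance close to the original, which makes it more plausible that the original label still applies. Tokens without a vector, or without a sufficiently similar neighbor, are simply left alone.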
