Journal: Data & Knowledge Engineering

A novel domain and event adaptive tweet augmentation approach for enhancing the classification of crisis related tweets



Abstract

One purpose of detecting crisis-related tweets is to single out those that provide information about help needed and help offered. Classifying such tweets is difficult because too few annotated tweets are available in those categories. To facilitate such classification, a domain- and event-adaptive augmentation approach is proposed. The main objective of the research is to improve the classification of crisis-related tweets that have few training samples. The proposed algorithms are designed to integrate the innate domain and event information when selecting words for augmentation. Components such as the CrisisLex lexicon, Word2Vec embeddings, and WordNet are used for the proposed augmentation. Experiments are carried out to substantiate the benefits of augmentation. Results indicate that classifier performance increases when the classifier is provided with the expanded dataset, which includes both the augmented and the original tweets. The novel tweet augmentation algorithm can be used to combat the overfitting and class imbalance that arise from having few training samples. An advantage of the proposed algorithms is their ability to retain the structure and inherent nature of the tweets during augmentation.
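
The abstract does not give the algorithms themselves, so the following is only a minimal sketch of one possible reading of "domain-adaptive" word selection: words found in a crisis lexicon are left untouched, hashtags and mentions are kept, and only the remaining words are swapped for candidate replacements. Everything named here is hypothetical: `CRISIS_TERMS` is a toy stand-in for a lexicon such as CrisisLex, `CANDIDATES` is a toy stand-in for neighbours that might come from Word2Vec or WordNet, and `augment` is an illustrative helper, not the authors' method.

```python
# Toy stand-in for a crisis lexicon such as CrisisLex (assumed terms only).
CRISIS_TERMS = {"flood", "shelter", "rescue", "donate", "water"}

# Toy stand-in for replacement candidates that could be drawn from
# Word2Vec neighbours or WordNet synonyms (assumed entries only).
CANDIDATES = {
    "need": ["require", "want"],
    "food": ["meals", "supplies"],
    "urgent": ["immediate"],
    "flood": ["deluge"],
}


def augment(tweet, max_variants=3):
    """Return up to max_variants copies of tweet, each with one word replaced.

    Crisis-lexicon terms, hashtags, and mentions are never replaced, so the
    domain vocabulary and the tweet's structure are preserved.
    """
    words = tweet.split()
    variants = []
    for i, word in enumerate(words):
        core = word.strip(".,!?:;")  # ignore surrounding punctuation for lookup
        key = core.lower()
        if not core or core[0] in "@#" or key in CRISIS_TERMS:
            continue
        for cand in CANDIDATES.get(key, []):
            replaced = word.replace(core, cand, 1)  # keep the punctuation
            variants.append(" ".join(words[:i] + [replaced] + words[i + 1:]))
            if len(variants) == max_variants:
                return variants
    return variants


if __name__ == "__main__":
    for variant in augment("We need food and water after the flood #help"):
        print(variant)
```

In this sketch the expanded training set would be the original tweets plus the returned variants, each carrying the label of its source tweet. A real pipeline would replace the two toy tables with an actual lexicon and an embedding or thesaurus lookup.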
