Venue: Social Media Mining for Health Applications Workshop Shared Task; International Conference on Computational Linguistics

SMM4H Shared Task 2020 - A Hybrid Pipeline for Identifying Prescription Drug Abuse from Twitter: Machine Learning, Deep Learning, and Post-Processing

Abstract

This paper presents our approach to multi-class classification of tweets that mention prescription medications. Using natural language processing techniques, each tweet is assigned to one of four classes: potential abuse/misuse (A), consumption/non-abuse (C), mention-only (M), or unrelated reference (U). Data augmentation increased our training and validation corpora from 13,172 tweets to 28,094 tweets. We also created word embeddings from domain-specific social media and medical corpora. Our hybrid pipeline, which combines an attention-based CNN with post-processing, was the best-performing system in Task 4 of SMM4H 2020, with an F1 score of 0.51 for class A.
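
To make the reported metric concrete, the following minimal sketch computes a class-specific F1 score for class A, assuming the standard definition of F1 as the harmonic mean of precision and recall. The function name `class_f1` and the toy label lists are hypothetical illustrations; they are not data or code from the paper.

```python
from typing import Sequence


def class_f1(gold: Sequence[str], pred: Sequence[str], target: str = "A") -> float:
    """F1 for one class, treating `target` as positive and all other labels as negative."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")

    # Count true positives, false positives, and false negatives for the target class.
    tp = sum(1 for g, p in zip(gold, pred) if g == target and p == target)
    fp = sum(1 for g, p in zip(gold, pred) if g != target and p == target)
    fn = sum(1 for g, p in zip(gold, pred) if g == target and p != target)

    if tp == 0:
        # Precision or recall is zero (or undefined), so F1 is reported as 0.
        return 0.0

    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels over the four classes A, C, M, U (made up for illustration).
gold = ["A", "C", "M", "U", "A", "C"]
pred = ["A", "A", "M", "U", "C", "C"]
score = class_f1(gold, pred, target="A")
print(f"F1 for class A: {score:.2f}")
```

In this toy example one abuse tweet is found, one is missed, and one consumption tweet is wrongly flagged, so precision and recall for class A are both 0.5.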