International Workshop on Health Text Mining and Information Analysis

Identifying Personal Experience Tweets of Medication Effects Using Pre-trained RoBERTa Language Model and Its Updating



Abstract

Post-market surveillance, the practice of monitoring the safe use of pharmaceutical drugs, is an important part of pharmacovigilance. Collecting personal experiences related to pharmaceutical product use could help us gain insight into how the human body reacts to different medications. Twitter, a popular social media service, is considered an important alternative data source for collecting personal experience information about medications. Identifying personal experience tweets is a challenging classification task in natural language processing. In this study, we used three methods based on Facebook's Robustly Optimized BERT Pretraining Approach (RoBERTa) to predict personal experience tweets related to medication use: the first combines the pre-trained RoBERTa model with a classifier; the second first updates the pre-trained RoBERTa model on a corpus of unlabeled tweets and then combines it with a classifier; and the third trains a RoBERTa model from scratch on our unlabeled tweets before combining it with the classifier. Our results show that all of these approaches outperform the published method (word embedding + LSTM) in classification performance (p < 0.05), and that updating the pre-trained language model with medication-related tweets improves performance further.
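The three RoBERTa-based setups described in the abstract can be sketched with the Hugging Face `transformers` library. This is a minimal illustration, not the authors' code: the `preprocess_tweet` helper, the checkpoint names, and the choice of a two-label head are assumptions the paper does not specify.

```python
import re
from typing import Optional


def preprocess_tweet(text: str) -> str:
    """Normalize a tweet before tokenization (illustrative choices;
    the paper does not describe its preprocessing)."""
    text = re.sub(r"https?://\S+", "<url>", text)  # mask links
    text = re.sub(r"@\w+", "<user>", text)         # mask user mentions
    return re.sub(r"\s+", " ", text).strip()       # collapse whitespace


def build_classifier(checkpoint: Optional[str] = None):
    """Load RoBERTa with a binary classification head
    (personal-experience tweet vs. not).

    - Variant 1: pass nothing and use the stock 'roberta-base' weights.
    - Variant 2: pass a checkpoint produced by continued masked-LM
      pretraining on the unlabeled medication-tweet corpus.
    - Variant 3: pass a checkpoint pretrained from scratch on that corpus.

    Imports are local so the preprocessing above runs even without
    transformers installed.
    """
    from transformers import (AutoModelForSequenceClassification,
                              AutoTokenizer)
    name = checkpoint or "roberta-base"
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModelForSequenceClassification.from_pretrained(
        name, num_labels=2)
    return tokenizer, model
```

Variant 2's "updating" step corresponds to continued masked-language-model pretraining (e.g. with `AutoModelForMaskedLM` and the `Trainer` API) on the unlabeled tweet corpus before fine-tuning the classification head on the labeled tweets.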
