International Conference on Language Resources and Evaluation

Stigma Annotation Scheme and Stigmatized Language Detection in Health-Care Discussions on Social Media

Abstract

Much research in the social sciences has examined how stigma is interpreted and how it influences human behaviour and health. Stigma results in out-of-group exclusion, distancing, cognitive separation, status loss, discrimination, and in-group pressure, and it often leads to disengagement and to non-adherence to treatment plans and doctors' prescriptions. However, little work has been done on the computational identification of stigma in general, and in social media discourse in particular. In this paper, we develop an annotation scheme and improve the annotation process for stigma identification; both can be applied to other health-care domains. Data from pro-vaccination and anti-vaccination discussion groups are annotated by trained annotators with professional backgrounds in social science and health-care studies, so this group can be considered expert on the subject in comparison to a non-expert crowd. Amazon MTurk annotators form a second group; because nothing is known about their educational background, they are initially treated as a non-expert crowd on the subject of stigma. We analyze the annotations with visualisation techniques and with features from the LIWC (Linguistic Inquiry and Word Count) list, and we make predictions based on bi-grams with traditional and deep learning models. The data augmentation method and the CNN achieve high accuracy in comparison to the other models. The high prediction rate achieved with the CNN reconfirms the success of the rigorous annotation process for identifying stigma.
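
The abstract mentions prediction from bi-grams with a traditional model but does not say which model or tokenizer was used. The following is only a minimal sketch of that idea, assuming whitespace tokenization and a multinomial Naive Bayes classifier with add-one smoothing. The class name `BigramNaiveBayes`, the labels, and the four training sentences are invented placeholders for illustration; they are not the paper's method or data.

```python
import math
from collections import Counter


def bigrams(text):
    """Lowercase the text, split on whitespace, and return adjacent word pairs."""
    tokens = text.lower().split()
    return list(zip(tokens, tokens[1:]))


class BigramNaiveBayes:
    """Multinomial Naive Bayes over word bi-gram counts (add-one smoothing).

    Hypothetical helper for illustration; not taken from the paper.
    """

    def fit(self, texts, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)
        self.feature_counts = {c: Counter() for c in self.classes}
        for text, label in zip(texts, labels):
            self.feature_counts[label].update(bigrams(text))
        # Vocabulary is the set of all bi-grams seen during training.
        self.vocab = set()
        for counts in self.feature_counts.values():
            self.vocab.update(counts)
        return self

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for c in self.classes:
            # Log prior plus smoothed log likelihood of each known bi-gram.
            score = math.log(self.doc_counts[c] / total_docs)
            denom = sum(self.feature_counts[c].values()) + len(self.vocab)
            for bg in bigrams(text):
                if bg in self.vocab:  # bi-grams unseen in training are skipped
                    score += math.log((self.feature_counts[c][bg] + 1) / denom)
            if score > best_score:
                best_label, best_score = c, score
        return best_label


# Invented toy sentences and labels, used only to exercise the code.
train_texts = [
    "those parents are to blame",
    "those parents are irresponsible",
    "the clinic opens at nine",
    "the clinic closes at five",
]
train_labels = ["stigma", "other", "other", "other"]
train_labels[1] = "stigma"  # first two sentences share the placeholder label

model = BigramNaiveBayes().fit(train_texts, train_labels)
print(model.predict("those parents are careless"))
print(model.predict("the clinic opens late"))
```

A CNN, as reported in the abstract, would replace the count-based classifier with learned filters over word sequences; that part is not sketched here because the abstract gives no architectural details.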