Journal: Journal of Medical Internet Research

Assessing Suicide Risk and Emotional Distress in Chinese Social Media: A Text Mining and Machine Learning Study

Abstract

Background: Early identification and intervention are imperative for suicide prevention. However, at-risk people often neither seek help nor undergo professional assessment. A tool that automatically assesses their risk levels in natural settings can increase the opportunity for early intervention.

Objective: The aim of this study was to explore whether computerized language analysis methods can be used to assess a person's suicide risk and emotional distress in Chinese social media.

Methods: A Web-based survey of Chinese social media (ie, Weibo) users was conducted to measure their suicide risk factors, including suicide probability, Weibo suicide communication (WSC), depression, anxiety, and stress levels. Participants' Weibo posts published in the public domain were also downloaded with their consent. The Weibo posts were parsed and fitted into Simplified Chinese-Linguistic Inquiry and Word Count (SC-LIWC) categories. The associations between SC-LIWC features and the 5 suicide risk factors were examined by logistic regression. Furthermore, a support vector machine (SVM) model was applied to the language features to automatically classify whether a Weibo user exhibited any of the 5 risk factors.

Results: A total of 974 Weibo users participated in the survey. Those with high suicide probability were marked by a higher usage of pronouns (odds ratio, OR=1.18, P=.001), prepend words (OR=1.49, P=.02), and multifunction words (OR=1.12, P=.04), a lower usage of verbs (OR=0.78, P<.001), and a greater total word count (OR=1.007, P=.008). Second-person plural was positively associated with severe depression (OR=8.36, P=.01) and stress (OR=11, P=.005), whereas work-related words were negatively associated with WSC (OR=0.71, P=.008), severe depression (OR=0.56, P=.005), and anxiety (OR=0.77, P=.02). Inconsistently, third-person plural was negatively associated with WSC (OR=0.02, P=.047) but positively associated with severe stress (OR=41.3, P=.04). Achievement-related words were positively associated with depression (OR=1.68, P=.003), whereas health-related (OR=2.36, P=.004) and death-related (OR=2.60, P=.01) words were positively associated with stress. The machine classifiers did not achieve satisfactory performance in the full sample set but could classify high suicide probability (area under the curve, AUC=0.61, P=.04) and severe anxiety (AUC=0.75, P<.001) among those who had exhibited WSC.

Conclusions: SC-LIWC is useful for examining language markers of suicide risk and emotional distress in Chinese social media and can identify characteristics different from previous findings in the English literature. Some findings lead to new hypotheses for future verification. Machine classifiers based on SC-LIWC features are promising but still require further optimization before application in real life.
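The odds ratios and AUC values reported in the abstract can be read with two short formulas. The sketch below is illustrative only and is not the study's code: the function names are hypothetical, the toy scores are made up, and it assumes each odds ratio refers to a one-unit increase in the corresponding SC-LIWC feature, which the abstract does not state. An odds ratio is the exponential of a logistic regression coefficient, and an AUC is the share of (positive, negative) user pairs that a classifier ranks in the right order.

```python
import math


def odds_ratio(beta: float) -> float:
    """Convert a logistic regression coefficient into an odds ratio."""
    return math.exp(beta)


def odds_multiplier(or_per_unit: float, delta: float) -> float:
    """Multiplicative change in odds for a `delta`-unit change in a feature.

    Assumes the odds ratio is expressed per one-unit increase.
    """
    return or_per_unit ** delta


def auc(scores: list[float], labels: list[int]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair (Mann-Whitney formulation).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # An OR below 1 (e.g. 0.78 for verb usage) corresponds to a negative coefficient.
    print(math.log(0.78) < 0)

    # Made-up classifier scores for four hypothetical users (1 = at risk).
    toy_scores = [0.1, 0.4, 0.35, 0.8]
    toy_labels = [0, 0, 1, 1]
    print(auc(toy_scores, toy_labels))
```

Under this reading, an AUC of 0.5 corresponds to chance-level ranking, which gives context for the reported values of 0.61 and 0.75.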
