Automatic Construction of a Depression-Domain Lexicon Based on Microblogs: Text Mining Study

Genghao Li; Bing Li; Langlin Huang; Sibing Hou

Abstract

Background: According to a 2017 World Health Organization report, almost 1 in every 20 people in China had depression. However, clinical diagnosis of depression is often difficult because of slow observation, high cost, and patient resistance. Meanwhile, with the rapid emergence of social networking sites, people frequently share their daily lives and disclose their inner feelings online, making it possible to identify mental conditions effectively from this rich text information. Much has been achieved with English web-based corpora, but in research in China, the extraction of language features from web-based depression signals is still at a relatively early stage.
Objective: The purpose of this study was to propose an effective approach for constructing a depression-domain lexicon. This lexicon will contain language features that could help identify social media users who potentially have depression. Our study also compared detection performance with and without our lexicon.
Methods: We automatically constructed a depression-domain lexicon using Word2Vec, a semantic relationship graph, and the label propagation algorithm. Combined, these methods performed well on a specific corpus during construction. The lexicon was obtained from 111,052 Weibo microblogs posted by 1868 users who were depressed or nondepressed. For depression detection, we considered six features and used five classification methods to test detection performance.
Results: In terms of the F1 value, our autoconstruction method performed 1% to 6% better than baseline approaches and was more effective and steadier. When applied to detection models such as logistic regression and support vector machine, our lexicon improved the models' performance by 2% to 9% and improved the final accuracy of potential depression detection.
Conclusions: Our depression-domain lexicon proved to be a meaningful input for classification algorithms, providing linguistic insights into the depressive status of test subjects. We believe that this lexicon will enhance early detection of depression in people on social media. Future work will need to be carried out on a larger corpus and with more complex methods.
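
The Methods summary names three ingredients: word embeddings, a semantic relationship graph, and label propagation. The sketch below shows one way such a pipeline could fit together. It is not the authors' implementation. The 2-D vectors are hand-written stand-ins for Word2Vec embeddings, and the seed words, the 0.7 similarity threshold, and the simple neighbor-averaging update are all assumptions made for illustration.

```python
import math

# Toy 2-D vectors standing in for Word2Vec embeddings (hypothetical values).
VECTORS = {
    "hopeless": (0.9, 0.1),
    "insomnia": (0.8, 0.3),
    "worthless": (0.95, 0.05),
    "picnic": (0.1, 0.9),
    "holiday": (0.2, 0.8),
    "sunshine": (0.05, 0.95),
}
# Hypothetical seed labels: 1.0 = depression-related, 0.0 = unrelated.
SEEDS = {"hopeless": 1.0, "picnic": 0.0}


def cosine(u, v):
    """Cosine similarity between two 2-D vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))


def build_graph(vectors, threshold=0.7):
    """Link word pairs whose cosine similarity is at least `threshold`."""
    graph = {word: {} for word in vectors}
    words = list(vectors)
    for i, a in enumerate(words):
        for b in words[i + 1:]:
            sim = cosine(vectors[a], vectors[b])
            if sim >= threshold:
                graph[a][b] = sim
                graph[b][a] = sim
    return graph


def propagate(graph, seeds, iterations=50):
    """Spread seed labels: each unlabeled word takes the similarity-weighted
    mean of its neighbors' scores, while seed scores stay clamped."""
    scores = {word: seeds.get(word, 0.5) for word in graph}
    for _ in range(iterations):
        updated = {}
        for word, neighbors in graph.items():
            if word in seeds or not neighbors:
                updated[word] = scores[word]
                continue
            total = sum(neighbors.values())
            updated[word] = sum(scores[n] * w for n, w in neighbors.items()) / total
        scores = updated
    return scores


graph = build_graph(VECTORS)
scores = propagate(graph, SEEDS)
# Words whose propagated score exceeds 0.5 form the toy lexicon.
lexicon = sorted(word for word, score in scores.items() if score > 0.5)
print(lexicon)  # ['hopeless', 'insomnia', 'worthless']
```

In a real setting the vectors would come from an embedding model trained on the microblog corpus, and the threshold and seed list would need tuning. The toy values here only demonstrate how labels flow from seeds to semantically close words.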


Bibliographic information

Source
JMIR Medical Informatics, 2020, Issue 6
Authors
Genghao Li; Bing Li; Langlin Huang; Sibing Hou
Original format
PDF
Keywords
depression detection; depression diagnosis; social media; automatic construction; domain-specific lexicon; depression lexicon; label propagation


