Journal of Medical Internet Research

Developing Embedded Taxonomy and Mining Patients’ Interests From Web-Based Physician Reviews: Mixed-Methods Approach

Abstract

Background: Web-based physician reviews are invaluable gold mines that merit further investigation. Although many studies have explored the text of physician reviews, very few have focused on developing a systematic taxonomy of the topics embedded in them. The first step toward mining physician reviews is to determine the natural structure, or dimensions, embedded in the reviews. It is therefore important to develop the topic taxonomy rigorously and systematically.

Objective: This study aims to develop a hierarchical topic taxonomy that uncovers the latent structure of physician reviews and to illustrate its application for mining patients’ interests using the proposed taxonomy and algorithm.

Methods: The data comprised 122,716 physician reviews, including reviews of 8501 doctors, from a leading physician review website in China (haodf.com), collected between 2007 and 2015. Mixed methods, including a literature review, data-driven topic discovery, and human annotation, were used to develop the physician review topic taxonomy.

Results: The identified taxonomy included 3 domains (high-level categories) and 9 subtopics (low-level categories). The physician-related domain included medical ethics, medical competence, communication skills, and medical advice and prescription. The patient-related domain included patient profile, symptoms, and diagnosis and pathogenesis. The system-related domain included financing and operation process. The F-measure of the proposed classification algorithm reached 0.816 on average. Symptoms (Cohen d=1.58, Δu=0.216, t=229.75, P<.001) are more often mentioned by patients with acute diseases, whereas communication skills (Cohen d=−0.29, Δu=−0.038, t=−42.01, P<.001), financing (Cohen d=−0.68, Δu=−0.098, t=−99.26, P<.001), and diagnosis and pathogenesis (Cohen d=−0.55, Δu=−0.078, t=−80.09, P<.001) are more often mentioned by patients with chronic diseases. Patients with mild diseases were more interested in medical ethics (Cohen d=0.25, Δu=0.039, t=8.33, P<.001), operation process (Cohen d=0.57, Δu=0.060, t=18.75, P<.001), patient profile (Cohen d=1.19, Δu=0.132, t=39.33, P<.001), and symptoms (Cohen d=1.91, Δu=0.274, t=62.82, P<.001). Meanwhile, patients with serious diseases were more interested in medical competence (Cohen d=−0.99, Δu=−0.165, t=−32.58, P<.001), medical advice and prescription (Cohen d=−0.65, Δu=−0.082, t=−21.45, P<.001), financing (Cohen d=−0.26, Δu=−0.018, t=−8.45, P<.001), and diagnosis and pathogenesis (Cohen d=−1.55, Δu=−0.229, t=−50.93, P<.001).

Conclusions: This mixed-methods approach, integrating a literature review, data-driven topic discovery, and human annotation, is an effective and rigorous way to develop a physician review topic taxonomy. The proposed algorithm, based on Labeled Latent Dirichlet Allocation, achieves strong classification results for mining patients’ interests. Furthermore, the mining results reveal marked differences in patients’ interests across disease types, socioeconomic development levels, and hospital levels.
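The group comparisons in the Results are reported as a Cohen d, a mean difference, and a t statistic per topic, but the abstract does not say which variants were computed. As a minimal sketch only, assuming each review carries a per-topic weight, that Cohen d uses the pooled standard deviation, and that t is Welch's unequal-variance statistic, the three quantities could be obtained as follows. The function names and the sample weights are hypothetical and are not taken from the study.

```python
import math
from statistics import mean, variance


def mean_difference(a, b):
    """Difference of group means (first group minus second group)."""
    return mean(a) - mean(b)


def cohen_d(a, b):
    """Cohen's d using the pooled standard deviation (assumed variant)."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return mean_difference(a, b) / math.sqrt(pooled_var)


def welch_t(a, b):
    """Welch's t statistic for two independent samples (assumed variant)."""
    na, nb = len(a), len(b)
    standard_error = math.sqrt(variance(a) / na + variance(b) / nb)
    return mean_difference(a, b) / standard_error


# Hypothetical per-review weights of one topic (e.g. "symptoms") in two groups.
acute = [0.30, 0.35, 0.25, 0.40]
chronic = [0.10, 0.15, 0.05, 0.20]

print("mean difference:", round(mean_difference(acute, chronic), 3))
print("Cohen d:", round(cohen_d(acute, chronic), 3))
print("Welch t:", round(welch_t(acute, chronic), 3))
```

Under this convention a positive value means the topic carries more weight in the first group and a negative value means more weight in the second, which matches how the signs are read in the Results.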
