Published in: Procedia Computer Science

Hybrid System for Information Extraction from Social Media Text: Drug Abuse Case Study

Abstract

Social media are becoming widely used in healthcare as a communication tool between patients and caregivers, giving rise to new sources of information rich in knowledge that may improve the field. Social media data analysis has therefore become a real business requirement for the healthcare industry and for data scientists. However, because these data are complex and unstructured, existing natural language processing tools cannot exploit them successfully. The literature offers a wide range of approaches based on dictionaries, linguistic patterns, and machine learning, each with its own strengths and weaknesses. In this work, we propose a hybrid system that combines these approaches, taking advantage of each, to extract structured and salient drug abuse information from health-related tweets. We improve the system's accuracy by updating the domain dictionary in real time. We collected 1,000,000 tweets and conducted several experiments showing the advantage of hybridization for efficient information extraction from social media data.
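
The abstract does not describe the system's internals, so the following is only a minimal sketch of how a dictionary lookup and a linguistic pattern might be combined, with pattern matches fed back into the dictionary. The seed terms, the regular expression, and the names `SEED_DRUG_TERMS`, `ABUSE_PATTERN`, and `extract` are all hypothetical and are not taken from the paper. The machine learning component is left out entirely.

```python
import re

# Hypothetical seed dictionary; the paper's actual domain dictionary is not given.
SEED_DRUG_TERMS = {"oxycodone", "xanax", "codeine"}

# Hypothetical linguistic pattern: an abuse-related cue followed by a candidate term.
ABUSE_PATTERN = re.compile(
    r"\b(?:snorted|popped|abusing|addicted to|hooked on)\s+(?:some\s+|my\s+)?([a-z]+)",
    re.IGNORECASE,
)


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def extract(tweet, dictionary):
    """Return drug mentions found by dictionary lookup or by the pattern.

    Terms found only by the pattern are added to `dictionary` in place,
    so later tweets can match them by plain lookup.
    """
    dict_hits = {t for t in tokenize(tweet) if t in dictionary}
    pattern_hits = {m.group(1).lower() for m in ABUSE_PATTERN.finditer(tweet)}
    new_terms = sorted(pattern_hits - dictionary)
    dictionary.update(new_terms)  # assumed stand-in for a "real-time" update
    return {"drugs": sorted(dict_hits | pattern_hits), "new_terms": new_terms}


if __name__ == "__main__":
    terms = set(SEED_DRUG_TERMS)
    print(extract("He popped some percs last night", terms))
    print(extract("percs and xanax again", terms))
```

A pattern this loose also captures non-drug words (for example, "popped the question" would yield "the"). In a hybrid design, that is the kind of candidate a trained classifier would be expected to filter before the dictionary is updated.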