Automated Classification of Online Sources for Infectious Disease Occurrences Using Machine-Learning-Based Natural Language Processing Approaches


Abstract

Collecting valid information from electronic sources to detect potential outbreaks of infectious disease is time-consuming and labor-intensive. Automated identification of relevant information using machine learning is necessary to respond to a potential disease outbreak. A total of 2864 documents were collected from various websites and then manually categorized and labeled by two reviewers. Accurate labels for the training and test data were provided based on reviewer consensus. Two machine learning algorithms (ConvNet and bidirectional long short-term memory, BiLSTM) and two classification methods (DocClass and SenClass) were used to classify the documents. Precision, recall, F1, accuracy, and area under the curve were measured to evaluate the performance of each model. With DocClass, ConvNet yielded higher average, minimum, and maximum accuracies (87.6%, 85.2%, and 91.1%, respectively) than BiLSTM, while with SenClass, BiLSTM performed better than ConvNet, with average, minimum, and maximum accuracies of 92.8%, 92.6%, and 93.3%, respectively. BiLSTM with SenClass yielded an overall accuracy of 92.9% in classifying infectious disease occurrences. Given a particular text extraction system, machine learning performed comparably to a human expert. This study suggests that analyzing information from websites using machine learning can achieve high accuracy when abundant articles/documents are available.
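
The abstract names precision, recall, F1, accuracy, and area under the curve as the evaluation measures. As a minimal sketch of how these are conventionally computed for a binary relevant/not-relevant classification task, the following may help. The function names, toy labels, and toy scores are illustrative assumptions and are not the authors' code or data.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Precision, recall, F1, and accuracy for 0/1 labels (1 = relevant document)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 8 documents, 4 of them truly relevant.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]                    # hard predictions
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.7, 0.1]    # model confidence for class 1

metrics = binary_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

The pairwise AUC above is quadratic in the number of documents, which is fine for a sketch; a rank-based computation would be preferable for large test sets.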

Bibliographic record

Journal: International Journal of Environmental Research and Public Health
Authors: Mira Kim; Kyunghee Chae; Seungwoo Lee; Hong-Jun Jang; Sukil Kim
Year (volume), issue: 2020 (17), 24
Year: 2020
Page: 9467
Total pages: 13
Original format: PDF
Chinese Library Classification: public health engineering
Keywords: machine learning; infectious disease; public health surveillance; online document; classification
Date indexed: 2022-08-21 12:00:16

Similar literature

1. Microbial phenomics information extractor (MicroPIE): a natural language processing tool for the automated acquisition of prokaryotic phenotypic characters from text sources [J]. Jin Mao, Lisa R. Moore, Carrine E. Blank, BMC Bioinformatics. 2016, No. 1
2. Automated Detection of Radiology Reports that Require Follow-up Imaging Using Natural Language Processing Feature Engineering and Machine Learning Classification [J]. Journal of digital imaging: the official journal of the Society for Computer Applications in Radiology. 2020, No. 1
3. Automated classification of building information modeling (BIM) case studies by BIM use based on natural language processing (NLP) and unsupervised learning [J]. Jung Namcheol, Lee Ghang, Advanced engineering informatics. 2019, No. AUGa
4. Automated Classification of Radiology Reports for Acute Lung Injury: Comparison of Keyword and Machine Learning Based Natural Language Processing Approaches [C]. Imre Solti, Colin R. Cooke, Fei Xia, IEEE International Conference on Bioinformatics and Biomedicine Workshop. 2009
5. Performance evaluation of a natural language processing tool to extract infectious disease problems. [D]. Mandel, Hannah L. 2013
6. Automated Classification of Radiology Reports for Acute Lung Injury: Comparison of Keyword and Machine Learning Based Natural Language Processing Approaches [O]. Imre Solti, Colin R. Cooke, Fei Xia
7. Automated Classification of Radiology Reports for Acute Lung Injury: Comparison of Keyword and Machine Learning Based Natural Language Processing Approaches [O]. Imre Solti, Colin R. Cooke, Fei Xia, 2015
