
Biomedical Event Extraction Using a New Error Detection Learning Approach Based on Neural Network


Abstract

Supervised machine learning approaches are effective in text mining, but their success relies heavily on manually annotated corpora. Annotated biomedical event corpora are few, however, and the available datasets contain too few examples to train classifiers well. A common remedy is to draw large numbers of training samples from unlabeled data, but such datasets often contain many mislabeled samples, which degrade classifier performance. This study therefore proposes a novel error data detection approach for reducing noise in unlabeled biomedical event data. First, we construct a mislabeled dataset through error analysis on the development dataset. Vector representations of sample pairs are then obtained by means of sequence patterns and a joint model combining a convolutional neural network with a long short-term memory recurrent neural network. Next, we propose a sample identification strategy that applies error detection, based on these pair representations, to the unlabeled data. The samples selected by this strategy are added to the training dataset to enrich it and improve classification performance. Experimental results on the BioNLP Shared Task GENIA indicate that the proposed approach is competent at extracting biomedical events from the biomedical literature. Our approach can effectively filter out some noisy examples and build a satisfactory prediction model.
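The abstract does not spell out the decision rule used for sample identification. As a rough illustration only, the sketch below assumes that each sample already has a fixed-length vector (in the paper these come from sequence patterns and the CNN-LSTM joint model; here they are hand-written toy values) and swaps in a plain cosine-similarity check: a candidate is discarded when it looks too much like any known mislabeled sample. The function names, the threshold value, and the similarity measure are all hypothetical and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def select_clean_samples(candidates, mislabeled_vectors, threshold=0.9):
    """Keep candidate ids whose vector is not close to any known mislabeled vector.

    candidates: list of (sample_id, vector) pairs drawn from unlabeled data.
    mislabeled_vectors: vectors of samples known to be mislabeled.
    threshold: hypothetical similarity cutoff; an assumption, not a value from the paper.
    """
    kept = []
    for sample_id, vec in candidates:
        max_sim = max((cosine(vec, m) for m in mislabeled_vectors), default=0.0)
        if max_sim < threshold:
            kept.append(sample_id)
    return kept


# Toy vectors standing in for learned representations (assumed, for illustration).
mislabeled = [[1.0, 0.0], [0.9, 0.1]]
candidates = [("a", [1.0, 0.05]), ("b", [0.0, 1.0]), ("c", [0.5, 0.5])]

kept = select_clean_samples(candidates, mislabeled, threshold=0.9)
print(kept)
```

In this toy run, candidate "a" lies almost on top of a known mislabeled vector and is filtered out, while "b" and "c" survive and would be added to the training set. A learned detector, as described in the abstract, would replace the fixed similarity cutoff.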

Bibliographic details

  • Source
    Computers, Materials & Continua | 2020, Issue 2 | pp. 923-941 | 19 pages
  • Author affiliations

    College of the Computer Science and Technology Jilin University Changchun 130012 China College of the Computer Science and Technology Inner Mongolia University for Nationalities Tongliao 028000 China;

    College of the Computer Science and Technology Jilin University Changchun 130012 China College of the Computer Science and Technology Inner Mongolia University for Nationalities Tongliao 028000 China;

    College of the Computer Science and Technology Jilin University Changchun 130012 China;

    College of the Computer Science and Technology Inner Mongolia University for Nationalities Tongliao 028000 China;

    School of Science and Technology Yanching Institute of Technology Langfang 065202 China;

  • Original format: PDF
  • Language: English
  • Keywords

    Biomedical event extraction; pair representation; error data detection; sample identification;

  • Date indexed: 2022-08-18 22:05:39
