Conference on Empirical Methods in Natural Language Processing

Marginal Likelihood Training of BiLSTM-CRF for Biomedical Named Entity Recognition from Disjoint Label Sets

Abstract

Extracting typed entity mentions from text is a fundamental component of language understanding and reasoning. While there exist substantial labeled text datasets for multiple subsets of biomedical entity types, such as genes and proteins, or chemicals and diseases, it is rare to find large labeled datasets containing labels for all desired entity types together. This paper presents a method for training a single CRF extractor from multiple datasets with disjoint or partially overlapping sets of entity types. Our approach employs marginal likelihood training to insist on labels that are present in the data, while filling in "missing labels". This allows us to leverage all the available data within a single model. In experimental results on the BioCreative V CDR (chemicals/diseases), BioCreative VI ChemProt (chemicals/proteins) and MedMentions (19 entity types) datasets, we show that joint training on multiple datasets improves NER F1 over training in isolation, and our methods achieve state-of-the-art results.
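
The sketch below is a minimal, hedged illustration of the idea in the abstract: for each token, the observed annotation constrains the set of allowed tags, and the loss marginalizes (via the forward algorithm) over every tag sequence consistent with those constraints. The function names log_forward and marginal_nll and the allowed_mask representation are illustrative assumptions, not the paper's implementation, and a bare linear-chain CRF is assumed in place of the paper's full BiLSTM-CRF.

# A minimal sketch of a marginal-likelihood CRF loss (illustrative only;
# not the authors' implementation).
import torch

def log_forward(emissions, transitions, allowed_mask):
    # Forward algorithm restricted to the tags permitted at each position.
    # emissions:    (seq_len, num_tags) per-token tag scores (e.g. from a BiLSTM)
    # transitions:  (num_tags, num_tags) CRF transition scores, [i, j] = i -> j
    # allowed_mask: (seq_len, num_tags) bool, True where a tag is permitted
    # Returns the log-sum-exp of scores of all tag sequences obeying the mask.
    neg_inf = torch.tensor(float("-inf"))
    alpha = torch.where(allowed_mask[0], emissions[0], neg_inf)
    for t in range(1, emissions.size(0)):
        scores = alpha.unsqueeze(1) + transitions + emissions[t].unsqueeze(0)
        alpha = torch.where(allowed_mask[t],
                            torch.logsumexp(scores, dim=0), neg_inf)
    return torch.logsumexp(alpha, dim=0)

def marginal_nll(emissions, transitions, allowed_mask):
    # Negative marginal log-likelihood: the numerator sums over every tag
    # sequence consistent with the observed (partial) labels, while the
    # denominator is the ordinary partition function over all sequences.
    all_allowed = torch.ones_like(allowed_mask, dtype=torch.bool)
    log_numerator = log_forward(emissions, transitions, allowed_mask)
    log_partition = log_forward(emissions, transitions, all_allowed)
    return log_partition - log_numerator

Under this sketch, a token whose gold tag is observed keeps only that tag in allowed_mask, while a token from a dataset that does not annotate a given entity type allows O as well as every tag of the uncovered types, so the missing labels are marginalized out rather than forced to O.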