Conference: International Conference on Advances in Computing, Communications and Informatics

A comparative study of segment representation for biomedical named entity recognition

Abstract

Biomedical Named Entity Recognition (Bio-NER) is an important subtask of Biomedical Text Mining (BioTM): the performance of downstream tasks such as relation extraction, protein-protein interaction and hypothesis generation depends on the performance of Bio-NER. Bio-NER involves identifying the biomedical named entities, such as DNA, RNA, cell types, genes and proteins, that occur in biomedical research articles. Annotating the dataset used to train a classifier to recognize and classify named entities is a crucial task in Bio-NER. Segment representation (SR) is an efficient way of annotating Biomedical Named Entities (BioNEs) within a sentence so as to differentiate them from non-BioNEs. In this paper, we use Support Vector Machines (SVMs) and Conditional Random Fields (CRFs) to train different Bio-NER models on the benchmark JNLPBA 2004 and i2b2 2010 shared task datasets using different SRs. The results show that the more complex the SR model, the worse its F-score.
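
The abstract does not say which segment representations were compared. As a rough illustration of what an SR is, the sketch below labels the tokens of a sentence under three widely used schemes (IO, IOB2 and IOBES). The function name `tag_tokens`, the span format and the example sentence are assumptions made for this illustration only; they are not taken from the paper or from the JNLPBA or i2b2 data.

```python
from typing import List, Tuple

# Hypothetical span format: (start index, end index exclusive, entity type).
Span = Tuple[int, int, str]


def tag_tokens(n_tokens: int, spans: List[Span], scheme: str = "IOB2") -> List[str]:
    """Label each token under one of three common segment representations."""
    if scheme not in {"IO", "IOB2", "IOBES"}:
        raise ValueError(f"unsupported scheme: {scheme}")
    tags = ["O"] * n_tokens  # tokens outside any entity keep the "O" label
    for start, end, etype in spans:
        length = end - start
        for i in range(start, end):
            if scheme == "IO":
                prefix = "I"  # no boundary information at all
            elif scheme == "IOB2":
                prefix = "B" if i == start else "I"  # marks entity beginnings
            else:  # IOBES: marks beginnings, ends and single-token entities
                if length == 1:
                    prefix = "S"
                elif i == start:
                    prefix = "B"
                elif i == end - 1:
                    prefix = "E"
                else:
                    prefix = "I"
            tags[i] = f"{prefix}-{etype}"
    return tags


# Made-up example sentence with two made-up entity annotations.
tokens = ["IL-2", "gene", "expression", "requires", "NF-kappa", "B", "."]
spans = [(0, 2, "DNA"), (4, 6, "protein")]
for scheme in ("IO", "IOB2", "IOBES"):
    print(scheme, list(zip(tokens, tag_tokens(len(tokens), spans, scheme))))
```

With k entity types, these three schemes need k + 1, 2k + 1 and 4k + 1 labels respectively, which is one sense in which a representation can be called more complex: the classifier has to separate more classes from the same amount of annotated text.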
