A Word Similarity Feature-based Semi-supervised Approach for Named Entity Recognition

Abstract

Named Entity Recognition (NER) is an important branch of Natural Language Processing (NLP). Among existing NER methods, one of the most advanced and commonly deployed approaches is the Long Short-Term Memory network with a Conditional Random Field layer (LSTM-CRF). However, this supervised method generally requires a large amount of labeled corpora, which are scarce for the drug patent texts considered in this study. With this in mind, a word similarity feature-based semi-supervised NER approach is proposed. Word similarity features with respect to the various entity types are first extracted from word embeddings to form a similarity constraint. They are then combined with the features computed by the supervised LSTM. Finally, the tagged results are obtained through the CRF layer. By introducing the similarity features of word embeddings into the LSTM-CRF model, the proposed method can greatly reduce the number of similar entities that are left untagged. Experimental studies demonstrate that the proposed method shows clear advantages in both accuracy and comprehensiveness when compared with the traditional baseline model and other semi-supervised methods.
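
The feature-combination step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so everything here is an assumption: each entity type is represented by the mean embedding of a few seed entities, similarity is derived from Euclidean distance (suggested only by the paper's keywords), and the resulting per-type scores are concatenated with per-token LSTM features before a CRF layer. The function names and the entity-type labels used in examples are hypothetical.

```python
import numpy as np


def similarity_features(word_vecs, seed_vecs_by_type):
    """Return one similarity score per (token, entity type).

    word_vecs: array of shape (n_tokens, dim), one embedding per token.
    seed_vecs_by_type: dict mapping a (hypothetical) entity-type name to an
        array of shape (n_seeds, dim) of embeddings of known entities.
    """
    types = sorted(seed_vecs_by_type)
    # Assumption: one prototype per entity type, the mean of its seed embeddings.
    prototypes = np.stack([seed_vecs_by_type[t].mean(axis=0) for t in types])
    # Pairwise Euclidean distances, shape (n_tokens, n_types).
    dists = np.linalg.norm(
        word_vecs[:, None, :] - prototypes[None, :, :], axis=-1
    )
    # Map distance to a similarity in (0, 1]; 1 means equal to the prototype.
    return types, 1.0 / (1.0 + dists)


def combine_features(lstm_feats, sim_feats):
    """Concatenate per-token LSTM features with similarity features.

    The combined array would be the input to a CRF layer (not shown here).
    """
    return np.concatenate([lstm_feats, sim_feats], axis=-1)
```

In this sketch a token whose embedding lies close to an entity-type prototype receives a high score for that type even if it never appeared in the labeled data, which is one plausible way unlabeled embeddings could help tag similar entities.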

Bibliographic record

Source
International Conference on System Science and Engineering | 2019 | pp. 136-141 | 6 pages
Authors
Ze Wang; Zhongyang Han; Jun Zhao; Wei Wang; Feng Jin
Original format PDF
Keywords
Feature extraction; Computer architecture; Computational modeling; Tagging; Task analysis; Euclidean distance; Drugs

Similar literature

1. Named entity recognition: a semi-supervised learning approach [J]. H. Sintayehu, G. S. Lehal. International Journal of Information Technology. 2021, No. 4
2. Isarn Dharma Word Segmentation Using a Statistical Approach with Named Entity Recognition [J]. Somsap Sittichai, Seresangtakul Pusadee. ACM Transactions on Asian and Low-Resource Language Information Processing. 2020, No. 2
3. Chinese word segmentation and named entity recognition: A pragmatic approach [J]. Gao JF, Li M, Wu A. Computational Linguistics. 2005, No. 4
4. A Word Similarity Feature-based Semi-supervised Approach for Named Entity Recognition [C]. Ze Wang, Zhongyang Han, Jun Zhao. International Conference on System Science and Engineering. 2019
5. Semi-supervised Named Entity Recognition: Learning to recognize 100 entity types with little supervision [D]. Nadeau, David. 2007
6. Semi-Supervised Bidirectional Long Short-Term Memory and Conditional Random Fields Model for Named-Entity Recognition Using Embeddings from Language Models Representations [O]. Min Zhang, Guohua Geng, Jing Chen. 2020
7. Semi-Supervised Bio-Named Entity Recognition with Word-Codebook Learning [O]. 2012
