Cross-Domain and Semisupervised Named Entity Recognition in Chinese Social Media: A Unified Model

Jingjing Xu; Hangfeng He; Xu Sun; Xuancheng Ren; Sujian Li

Abstract

Named entity recognition (NER) in Chinese social media is an important but challenging task because Chinese social media language is informal and noisy. Most previous NER methods focus on in-domain supervised learning, which is limited by the scarcity of annotated data in social media. In this paper, we show that abundant corpora in formal domains and massive unannotated text can be combined to improve NER performance in social media. We propose a unified model that can learn from out-of-domain corpora and in-domain unannotated text. The unified model is composed of two parts: one for cross-domain learning and the other for semisupervised learning. Cross-domain learning acquires out-of-domain information based on domain similarity. Semisupervised learning acquires in-domain unannotated information by self-training. Experimental results show that our unified model yields a 9.57% improvement over strong baselines and achieves state-of-the-art performance.
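
The two components named in the abstract can be illustrated with a minimal sketch. It is not the paper's formulation: the abstract does not specify how domain similarity or self-training confidence is computed, so the character-count cosine similarity, the fixed confidence threshold, and all function names below (`domain_similarity_weights`, `self_training_round`) are assumptions made for illustration only.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two character-count vectors (Counters)."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def domain_similarity_weights(out_domain_sentences, in_domain_sentences):
    """Cross-domain part (assumed form): score each out-of-domain sentence
    by its similarity to the in-domain corpus, so that sentences closer to
    social media text can contribute more during training."""
    profile = Counter()
    for sentence in in_domain_sentences:
        profile.update(sentence)  # accumulate character counts
    return [cosine(Counter(s), profile) for s in out_domain_sentences]


def self_training_round(predict, unlabeled, threshold):
    """Semisupervised part (assumed form): label unannotated sentences with
    the current model and keep only confident predictions as pseudo-labels.

    `predict` is a hypothetical callable: sentence -> (labels, confidence).
    """
    pseudo_labeled = []
    for sentence in unlabeled:
        labels, confidence = predict(sentence)
        if confidence >= threshold:
            pseudo_labeled.append((sentence, labels, confidence))
    return pseudo_labeled
```

In such a setup, the similarity weights would scale the contribution of each formal-domain training sentence, and the pseudo-labeled sentences returned by `self_training_round` would be added to the training set for the next round.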

Bibliographic details

Source
IEEE/ACM Transactions on Audio, Speech, and Language Processing | 2018, issue 11 | pp. 2142-2152 | 11 pages
Authors
Jingjing Xu; Hangfeng He; Xu Sun; Xuancheng Ren; Sujian Li
Author affiliations

MOE Key Laboratory of Computational Linguistics, Peking University, Beijing, China;

Department of Computer and Information Science, University of Pennsylvania, Philadelphia, USA;

MOE Key Laboratory of Computational Linguistics, Peking University, Beijing, China;

MOE Key Laboratory of Computational Linguistics, Peking University, Beijing, China;

MOE Key Laboratory of Computational Linguistics, Peking University, Beijing, China;

Original format: PDF
Language of text: English
Keywords
Training; Social network services; Task analysis; Semisupervised learning; Kernel; Speech processing; Predictive models;

Similar literature

1. Event identification in web social media through named entity recognition and topic modeling [J]. Konstantinos N. Vavliakis, Andreas L. Symeonidis, Pericles A. Mitkas. Data & Knowledge Engineering. 2013, issue nova
2. New approach for Arabic named entity recognition on social media based on feature selection using genetic algorithm [J]. Brahim Ait Benali, Soukaina Mihi, Ismail El Bazi. International Journal of Electrical and Computer Engineering. 2021, issue 2
3. Hierarchical self-adaptation network for multimodal named entity recognition in social media [J]. Tian Yu, Sun Xian, Yu Hongfeng. Neurocomputing. 2021, issue Juna7
4. A Unified Model for Cross-Domain and Semi-Supervised Named Entity Recognition in Chinese Social Media [C]. Hangfeng He, Xu Sun. AAAI Conference on Artificial Intelligence. 2017
5. An Application of Natural Language Processing: Named Entity Recognition with BLSTM in Chinese Corpora [D]. Mao, Lihui. 2019
6. A deep learning model incorporating part of speech and self-matching attention for named entity recognition of Chinese electronic medical records [O]. Xiaoling Cai, Shoubin Dong, Jinlong Hu. 2019
7. F-Score Driven Max Margin Neural Network for Named Entity Recognition in Chinese Social Media [O]. He, Hangfeng; Sun, Xu. 2017
