Word Embeddings vs Word Types for Sequence Labeling: the Curious Case of CV Parsing

Abstract

We explore new methods of improving Curriculum Vitae (CV) parsing for German documents by applying recent research on the application of word embeddings in Natural Language Processing (NLP). Our approach integrates the word embeddings as input features for a probabilistic sequence labeling model that relies on the Conditional Random Field (CRF) framework. Best-performing word embeddings are generated from a large sample of German CVs. The best results on the extraction task are obtained by the model which integrates the word embeddings together with a number of hand-crafted features. The improvements are consistent throughout different sections of the target documents. The effect of the word embeddings is strongest on semi-structured, out-of-sample data.
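The feature integration described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy embedding table, the feature names, and the helper functions `token_features` and `sentence_features` are all hypothetical. It only shows one common way to combine hand-crafted features with embedding dimensions in a per-token feature dictionary, a layout of the kind that CRF toolkits such as python-crfsuite accept.

```python
from typing import Dict, List, Optional, Sequence

# Hypothetical 3-dimensional embeddings, for illustration only.
# Real embeddings would be trained on a large corpus (here: German CVs).
TOY_EMBEDDINGS: Dict[str, List[float]] = {
    "berufserfahrung": [0.8, 0.1, -0.3],
    "ausbildung": [0.7, 0.2, -0.1],
}


def token_features(
    tokens: Sequence[str],
    i: int,
    embeddings: Dict[str, List[float]],
) -> Dict[str, float]:
    """Build one feature dict for the token at position i."""
    token = tokens[i]

    # Hand-crafted features (assumed examples, not taken from the paper).
    feats: Dict[str, float] = {
        "bias": 1.0,
        "is_title": float(token.istitle()),
        "is_digit": float(token.isdigit()),
        "is_first": float(i == 0),
    }

    # Embedding features: one real-valued feature per dimension.
    vec: Optional[List[float]] = embeddings.get(token.lower())
    if vec is not None:
        for d, value in enumerate(vec):
            feats[f"emb_{d}"] = value
    else:
        # Out-of-vocabulary tokens get an indicator instead of a vector.
        feats["emb_oov"] = 1.0
    return feats


def sentence_features(
    tokens: Sequence[str],
    embeddings: Dict[str, List[float]],
) -> List[Dict[str, float]]:
    """Feature dicts for every token in one line of a document."""
    return [token_features(tokens, i, embeddings) for i in range(len(tokens))]


line = ["Berufserfahrung", "2010"]
features = sentence_features(line, TOY_EMBEDDINGS)
```

Each element of `features` would then be paired with a label for the corresponding token when training a sequence labeling model. Whether the embedding dimensions are used as raw real values or discretized first is a design choice the abstract does not specify.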

Bibliographic details

Source
Workshop on Vector Space Modeling for Natural Language Processing | 2015 | 6 pages
Authors
Melanie Tosik; Carsten L. Hansen; Gerard Goossen; Mihai Rotaru
Original format: PDF
Chinese Library Classification: Linear space theory (vector spaces)

Similar literature

1. Pharmacovigilance from social media: mining adverse drug reaction mentions using sequence labeling with word embedding cluster features [J]. Nikfarjam Azadeh, Sarker Abeed, O'Connor Karen. Journal of the American Medical Informatics Association. 2015, No. 3
2. Deep Contextualized Word Embeddings for Universal Dependency Parsing [J]. Liu Yijia, Che Wanxiang, Wang Yuxuan. ACM transactions on Asian language information processing. 2020, No. 1
3. Learning context-dependent word embeddings based on dependency parsing [J]. Ke Yan, Jie Chen, Wenhao Zhu. International journal of information technology and management. 2020, No. 4
4. Word Embeddings vs Word Types for Sequence Labeling: the Curious Case of CV Parsing [C]. Melanie Tosik, Carsten L. Hansen, Gerard Goossen. 1st Workshop on Vector Space Modeling for Natural Language Processing 2015. 2015
5. Parse decoration of the word sequence in the speech-to-text machine-translation pipeline [D]. Kahn, Jeremy G. 2010
6. Adverse drug event and medication extraction in electronic health records via a cascading architecture with different sequence labeling models and word embeddings [O]. Hong-Jie Dai, Chu-Hsien Su, Chi-Shin Wu. 2020
7. A Type Collection of CVCCVCCVC Words [O]. Eckler A. Ross. 2005
