Investigator Name Recognition from Medical Journal Articles: A Comparative Study of SVM and Structural SVM

Abstract

Automated extraction of bibliographic information from journal articles is key to the affordable creation and maintenance of citation databases, such as MEDLINE®. A newly required bibliographic field in this database is "Investigator Names": names of people who have contributed to the research addressed in the article, but who are not listed as authors. Since the number of such names is often large, several score or more, their manual entry is prohibitive. The automated extraction of these names is a problem in Named Entity Recognition (NER), but differs from typical NER due to the absence of normal English grammar in the text containing the names. In addition, since MEDLINE conventions require names to be expressed in a particular format, it is necessary to identify both first and last names of each investigator, an additional challenge. We seek to automate this task through two machine learning approaches: Support Vector Machine and structural SVM, both of which show good performance at the word and chunk levels. In contrast to traditional SVM, structural SVM attempts to learn a sequence by using contextual label features in addition to observational features. It outperforms SVM at the initial learning stage without using contextual observation features. However, with the addition of these contextual features from neighboring tokens, SVM performance improves to match or slightly exceed that of the structural SVM.
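
The distinction the abstract draws between a token's own observation features and the contextual observation features taken from neighboring tokens can be illustrated with a minimal sketch. The feature names, the window size, and the example tokens below are hypothetical illustrations and are not taken from the paper; the sketch only shows how a per-token classifier such as an SVM could be given context by copying features from adjacent tokens.

```python
from typing import Dict, List


def token_features(token: str) -> Dict[str, object]:
    """Observation features for one token (hypothetical feature set)."""
    return {
        "lower": token.lower(),
        "is_capitalized": token[:1].isupper(),
        # A single capital letter, optionally followed by a period, e.g. "J."
        "is_initial": len(token.rstrip(".")) == 1 and token[:1].isupper(),
        "has_comma": token.endswith(","),
    }


def windowed_features(tokens: List[str], i: int, window: int = 1) -> Dict[str, object]:
    """Features of token i plus those of its neighbors within `window`.

    Each feature name is prefixed with the neighbor's relative offset, so
    "-1:lower" is the lowercased previous token. Positions that fall outside
    the sequence are marked with a padding feature.
    """
    feats: Dict[str, object] = {}
    for offset in range(-window, window + 1):
        j = i + offset
        if 0 <= j < len(tokens):
            for name, value in token_features(tokens[j]).items():
                feats[f"{offset:+d}:{name}"] = value
        else:
            feats[f"{offset:+d}:pad"] = True
    return feats


# Hypothetical token sequence, invented for illustration only.
tokens = ["Investigators:", "J.", "Smith,", "Mary", "Jones"]
features = [windowed_features(tokens, i, window=1) for i in range(len(tokens))]
```

Setting `window=0` reduces this to observation features only, which corresponds to the setting without contextual observation features. A structural SVM would additionally model dependencies between neighboring labels, which this sketch does not attempt.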

Bibliographic record

Source
9th IAPR workshop on document analysis systems 2010 | 2010 | pp. 116-123 | 8 pages
Conference location: Boston, MA (US)
Authors
Xiaoli Zhang; Jie Zou; Daniel X. Le; George R. Thoma;
Author affiliations

National Library of Medicine, Lister Hill National Center for Biomedical Communications, 8600 Rockville Pike, Bethesda, 20894 (listed for each of the four authors)

Original format: PDF
Language: English
Chinese Library Classification: Computing technology, computer technology
Keywords
investigator name; named entity recognition; support vector machine (SVM); structural SVM; document analysis; MEDLINE
