BIOADI: a machine learning approach to identifying abbreviations and definitions in biological literature

Cheng-Ju Kuo; Maurice HT Ling; Kuan-Ting Lin; Chun-Nan Hsu

Abstract

Background: Text mining tools are becoming essential for automatically processing large quantities of biological literature for knowledge discovery and information curation. Abbreviation recognition is related to named entity recognition (NER) and can be treated as the task of recognizing, in free text, pairs consisting of a term and its corresponding abbreviation. Successfully identifying an abbreviation and its definition is a prerequisite for indexing the terms of text databases so that articles of related interest can be retrieved, and it is also a building block for improving existing gene mention tagging and gene normalization tools.

Results: Our approach to abbreviation recognition (AR) is based on machine learning and exploits a novel set of rich features to learn rules from training data. Tested on the AB3P corpus, our system achieved an F-score of 89.90% with 95.86% precision at 84.64% recall, higher than the result of the best-performing existing AR system. We also annotated a new corpus of 1200 PubMed abstracts derived from the BioCreative II gene normalization corpus. On this corpus, our system achieved an F-score of 86.20% with 93.52% precision at 79.95% recall, which also outperforms all tested systems.

Conclusion: We constructed BIOADI by applying our system to extract all short form-long form pairs from all available PubMed abstracts. Mining BIOADI reveals many interesting trends in biomedical research. We also provide offline AR software in the download section at http://bioagent.iis.sinica.edu.tw/BIOADI/ .
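The reported F-scores can be checked against the reported precision and recall. The sketch below assumes the F-score is the balanced F1 measure (the harmonic mean of precision and recall), which the abstract does not state explicitly; the function name `f_score` and the corpus labels are illustrative only.

```python
def f_score(precision: float, recall: float) -> float:
    """Balanced F-score (harmonic mean), in the same units as the inputs.

    Assumption: the abstract's "F-score" is F1; the abstract does not say so.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision %, recall %, reported F-score %) as given in the abstract
reported = {
    "AB3P corpus": (95.86, 84.64, 89.90),
    "BioCreative II-derived corpus": (93.52, 79.95, 86.20),
}

for name, (p, r, f_reported) in reported.items():
    f = f_score(p, r)
    print(f"{name}: computed F = {f:.2f}%, reported F = {f_reported:.2f}%")
```

Under this assumption, both computed values agree with the reported figures to two decimal places, so the three numbers given for each corpus are mutually consistent.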

Source: BMC Bioinformatics, 2009, issue 15
