Supporting the Curation of Biological Databases with Reusable Text Mining

Abstract

Curators of biological databases transfer knowledge from scientific publications, a laborious and expensive manual process. Machine learning algorithms can reduce the workload of curators by filtering relevant biomedical literature, though their widespread adoption will depend on the availability of intuitive tools that can be configured for a variety of tasks. We propose a new method for supporting curators by means of document categorization, and describe the architecture of a curator-oriented tool implementing this method using techniques that require no computational linguistic or programming expertise. To demonstrate the feasibility of this approach, we prototyped an application of this method to support a real curation task: identifying PubMed abstracts that contain allergen cross-reactivity information. We tested the performance of two different classifier algorithms (CART and ANN), applied to both composite and single-word features, using several feature scoring functions. Both classifiers exceeded our performance targets, the ANN classifier yielding the best results. These results show that the method we propose can deliver the level of performance needed to assist database curation.

Bibliographic details

Source: International Conference on Genome Informatics, 2005, 13 pages
Authors: Olivo Miotto; Tin Wee Tan; Vladimir Brusi
Original format: PDF
Chinese Library Classification: Molecular biology
Keywords: biological databases; database curation; text mining; machine learning
