Applying a text mining framework to the extraction of numerical parameters from scientific literature in the biotechnology domain

André SANTOS; Regina NOGUEIRA; Anália LOURENÇO

Abstract

Scientific publications are the main vehicle for disseminating information in the field of biotechnology for wastewater treatment. New research paradigms and the application of high-throughput technologies have considerably increased the rate of publication. As a result, manual curation has become harder, more error-prone and more time-consuming, leading to a probable loss of information and inefficient knowledge acquisition. Research outputs therefore rarely reach engineers, which hampers the calibration of the mathematical models used to optimize the stability and performance of biotechnological systems. In this context, we have developed a data curation workflow, based on text mining techniques, to extract numerical parameters from scientific literature, and applied it to the biotechnology domain. The workflow was built to process wastewater-related articles, with the main goal of identifying physico-chemical parameters mentioned in the text. This work describes the implementation of the workflow, identifies achievements and current limitations in the overall process, and presents the results obtained for a corpus of 50 full-text documents.

Bibliographic details

Source: Advances in Distributed Computing And Artificial Intelligence Journal, 2012, Issue 1, 8 pages
Authors: André SANTOS; Regina NOGUEIRA; Anália LOURENÇO
Original format: PDF
Chinese Library Classification: Artificial intelligence theory

Similar literature

1. Pitfalls in applying text mining to scientific literature [J]. Jean-Marc Neefs. BMC Bioinformatics. 2010, Supplement 5.
2. Scientific Literature Information Extraction Using Text Mining Techniques for Human Health Risk Assessment of Electromagnetic Fields [J]. Lee Sang-Woo, Kwon Jung-Hyok, Lee Ben. Sensors and Materials. 2020, Issue 1.
3. Basic Test Framework for the Evaluation of Text Line Segmentation and Text Parameter Extraction [J]. Darko Brodić, Dragan R. Milivojević, Zoran Milivojević. Sensors. 2010, Issue 5.
4. Semantically Enriched Literature Search Combining Text Mining, QSPR and Ontologies in Scientific Workflows [C]. Magnus Palmblad. IEEE International Conference on e-Science. 2018.
5. Text and network mining for literature-based scientific discovery in biomedicine [D]. Ozgur, Arzucan. 2010.
6. Pitfalls in applying text mining to scientific literature [O]. Jean-Marc Neefs. 2010.
7. Applying a text mining framework to the extraction of numerical parameters from scientific literature in the biotechnology domain [O]. Santos André Fernandes dos, Nogueira R., Lourenço Anália. 2012.
