Active learning-based information structure analysis of full scientific articles and two applications for biomedical literature review

Yufan Guo; Ilona Silins; Ulla Stenius; Anna Korhonen


Abstract

Motivation: Techniques that can automatically analyze the information structure of scientific articles could be highly useful for improving information access to biomedical literature. However, most existing approaches rely on supervised machine learning (ML) and on substantial labeled data that are expensive to develop and to apply to different sub-fields of biomedicine. Recent research shows that minimal supervision is sufficient for fairly accurate information structure analysis of biomedical abstracts. Is the same realistic for full articles, given their high linguistic and informational complexity? We introduce and release a novel corpus of 50 biomedical articles annotated according to the Argumentative Zoning (AZ) scheme, and investigate active learning with one of the most widely used ML models, Support Vector Machines (SVM), on this corpus. Additionally, we introduce two novel applications that use AZ to support real-life literature review in biomedicine via question answering and summarization.

Results: We show that active learning with SVM trained on 500 labeled sentences (6% of the corpus) performs surprisingly well, with an accuracy of 82%, just 2% lower than fully supervised learning. In our question answering task, biomedical researchers find relevant information significantly faster in AZ-annotated articles than in unannotated ones. In the summarization task, sentences extracted from particular zones are significantly more similar to gold standard summaries than those extracted from particular sections of full articles. These results demonstrate that active learning of full articles' information structure is indeed realistic and that the accuracy is high enough to support real-life literature review in biomedicine.

Availability: The annotated corpus, our AZ classifier and the two novel applications are available at http://www.cl.cam.ac.uk/~yg244/12bioinfo.html.
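The active learning loop mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state which query strategy, features or SVM implementation the authors used, so everything below is an assumption made for illustration: margin-based uncertainty sampling, a binary task instead of the multi-class AZ scheme, toy 2-D points instead of sentence features, and a small hand-written linear SVM (`train_linear_svm`) standing in for a real SVM library. All function names are hypothetical.

```python
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def train_linear_svm(X, y, epochs=200, lr=0.05, lam=0.01):
    """Fit a binary linear SVM (labels -1/+1) by subgradient descent on the
    L2-regularised hinge loss. Illustrative stand-in for a real SVM library."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            w = [wj * (1.0 - lr * lam) for wj in w]   # regularisation shrink
            if yi * (dot(w, xi) + b) < 1.0:           # point violates the margin
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b


def most_uncertain(w, b, pool, candidates, k):
    """Return the k candidate indices whose points lie closest to the hyperplane
    (assumed query strategy: margin-based uncertainty sampling)."""
    return sorted(candidates, key=lambda i: abs(dot(w, pool[i]) + b))[:k]


def active_learn(pool, oracle, seed_idx, rounds, batch):
    """Alternate between training on the labeled set and asking the oracle
    (the human annotator) to label the most uncertain unlabeled points."""
    labeled = list(seed_idx)
    for _ in range(rounds):
        w, b = train_linear_svm([pool[i] for i in labeled],
                                [oracle(i) for i in labeled])
        seen = set(labeled)
        unlabeled = [i for i in range(len(pool)) if i not in seen]
        labeled += most_uncertain(w, b, pool, unlabeled, batch)
    model = train_linear_svm([pool[i] for i in labeled],
                             [oracle(i) for i in labeled])
    return model, labeled


# Toy pool: two well-separated 2-D clusters (NOT the paper's data).
rng = random.Random(0)
pool = ([[rng.gauss(-2, 0.5), rng.gauss(-2, 0.5)] for _ in range(100)]
        + [[rng.gauss(2, 0.5), rng.gauss(2, 0.5)] for _ in range(100)])
labels = [-1] * 100 + [1] * 100

(w, b), labeled = active_learn(pool, lambda i: labels[i],
                               seed_idx=[0, 1, 100, 101], rounds=5, batch=4)
accuracy = sum((1 if dot(w, x) + b >= 0 else -1) == t
               for x, t in zip(pool, labels)) / len(pool)
print(f"labeled {len(labeled)} of {len(pool)} points, pool accuracy {accuracy:.2f}")
```

The point of the design is that the annotator labels only the examples the current model is least sure about, which is how a small labeled fraction of a corpus can approach fully supervised accuracy.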


Bibliographic details

Source: Bioinformatics, 2013, Issue 11, 8 pages
Authors: Yufan Guo; Ilona Silins; Ulla Stenius; Anna Korhonen
Original format: PDF
Language of text: English
Chinese Library Classification: Bioengineering (biotechnology)

Similar literature

1. Active learning-based information structure analysis of full scientific articles and two applications for biomedical literature review [J]. Yufan Guo, Ilona Silins, Ulla Stenius. Bioinformatics, 2013, Issue 11.
2. Detecting Community Structure in Scientific Network - A Citation Network Analysis for Scientific Review Articles [J]. Sheeba J.I., Pradeep Devaneyan S., Ram Vignesh B. International Journal of Applied Research on Information Technology and Computing, 2020, Issue 2.
3. Analysis of mobile applications reporting on nutritional recipes: a review of the scientific literature [J]. Jose Huamaní-Cahuana, Michael Cabanillas-Carbonell. E3S Web of Conferences, 2021, Issue a.
4. Application Software Analysis for Children with Autism Spectrum Disorder: a Review of the Scientific Literature from 2005 - 2020 [C]. Anny Cabanillas-Tello, Michael Cabanillas-Carbonell. E-Health and Bioengineering Conference, 2020.
5. A corpus-based investigation of scientific research articles: Linking move analysis with multidimensional analysis [D]. Kanoksilapatham, Budsaba. 2003.
6. Design of Additively Manufactured Structures for Biomedical Applications: A Review of the Additive Manufacturing Processes Applied to the Biomedical Sector [O]. Flaviana Calignano, Manuela Galati, Luca Iuliano. 2019.
7. Literature review of research on personal selling: A content analysis of scientific articles in 1997-2006 [O]. Lindborg Janne. 2008.
