PLoS Computational Biology

Getting More Out of Biomedical Documents with GATE's Full Lifecycle Open Source Text Analytics


Abstract

This software article describes the GATE family of open source text analysis tools and processes. GATE is one of the most widely used systems of its type, with yearly download rates in the tens of thousands and many active users in both academic and industrial contexts. In this paper we report three examples of GATE-based systems operating in the life sciences and in medicine: first, in genome-wide association studies that have contributed to the discovery of a head and neck cancer mutation association; second, in medical records analysis that has significantly increased the statistical power of treatment/outcome models in the UK's largest psychiatric patient cohort; third, in providing richer constructs for drug-related searching. We also explore the ways in which the GATE family supports the various stages of the lifecycle present in our examples. We conclude that the deployment of text mining for document abstraction or rich search and navigation is best thought of as a process, and that with the right computational tools and data collection strategies this process can be made well defined and repeatable. The GATE research programme is now 20 years old and has grown from its roots as a specialist development tool for text processing into a rather comprehensive ecosystem, bringing together software developers, language engineers and research staff from diverse fields. GATE now has a strong claim to cover a uniquely wide range of the lifecycle of text analysis systems. It forms a focal point for the integration and reuse of advances made by many people (the majority outside of the authors' own group) who work in text processing for biomedicine and other areas. GATE is available online [1] under GNU open source licences and runs on all major operating systems. Support is available from an active user and developer community and also on a commercial basis.
