Venue: 2011 Fifth IEEE International Conference on Semantic Computing
Generating Semantics for the Life Sciences via Text Analytics

Abstract

The life sciences have a strong need for carefully curated, semantically rich fact repositories. Knowledge harvesting from unstructured textual sources is currently performed by highly skilled curators, who manually feed semantics into such databases based on a deep understanding of the documents chosen to populate them. As this is a slow and costly process, we advocate an automatic approach to generating database contents based on JREX, a high-performance relation extraction system. As a real-life example, we target REGULONDB, the world's largest manually curated reference database for the transcriptional regulation network of E. coli. We investigate the performance of automatic knowledge capture from various literature sources, such as PUBMED abstracts and associated full-text articles. Our results show that we can indeed automatically re-create a considerable portion of the REGULONDB database by processing the relevant literature sources. Hence, this approach might help curators widen the knowledge acquisition bottleneck in this field.
