Source: NIH literature > PLoS Clinical Trials

Extracting microRNA-gene relations from biomedical literature using distant supervision

Abstract

Many biomedical relation extraction approaches are based on supervised machine learning and therefore require an annotated corpus. Distant supervision aims to train a classifier by combining a knowledge base with a corpus, reducing the manual effort required. This is particularly useful in biomedicine, where databases and ontologies are available for many biological processes while annotated corpora remain scarce. We studied the extraction of microRNA-gene relations from text. MicroRNA regulation is an important biological process because of its close association with human diseases. The proposed method, IBRel, is based on distantly supervised multi-instance learning. We evaluated IBRel on three datasets and compared its results with a co-occurrence approach and a supervised machine learning algorithm. While supervised learning outperformed IBRel on two of those datasets, IBRel obtained an F-score 28.3 percentage points higher on the dataset for which no training set had been specifically developed. To demonstrate the applicability of IBRel, we used it to extract 27 miRNA-gene relations from recently published papers about cystic fibrosis. Our results demonstrate that the method can be used to extract relations from literature about a biological process without an annotated corpus. The source code and data used in this study are available at .
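
The labeling step behind distantly supervised multi-instance learning can be sketched briefly. The following Python snippet is a minimal illustration and not the IBRel implementation: the function name build_bags, the input layout, and the placeholder entity names (miR-A, GENE1, and so on) are all assumptions made for this example. It groups sentences into one "bag" per candidate miRNA-gene pair and marks a bag as positive when the pair appears in the knowledge base, so no sentence is annotated by hand.

```python
from collections import defaultdict


def build_bags(sentences, knowledge_base):
    """Group sentences into bags keyed by (miRNA, gene) and label each bag.

    sentences: iterable of (text, mirnas, genes) tuples; entity recognition
        is assumed to have been done already (hypothetical input layout).
    knowledge_base: set of (mirna, gene) pairs treated as known relations.
    """
    bags = defaultdict(list)
    for text, mirnas, genes in sentences:
        # Every co-occurring miRNA-gene pair in a sentence is a candidate.
        for mirna in mirnas:
            for gene in genes:
                bags[(mirna, gene)].append(text)
    # Distant supervision: the bag label comes from the knowledge base,
    # not from manual annotation of individual sentences.
    labels = {pair: pair in knowledge_base for pair in bags}
    return dict(bags), labels


# Made-up toy data with placeholder names, for illustration only.
KB = {("miR-A", "GENE1")}
SENTENCES = [
    ("miR-A represses GENE1 in this cell line.", ["miR-A"], ["GENE1"]),
    ("miR-A and GENE1 were both measured.", ["miR-A"], ["GENE1"]),
    ("miR-B was detected alongside GENE2.", ["miR-B"], ["GENE2"]),
]

bags, labels = build_bags(SENTENCES, KB)
for pair, texts in bags.items():
    print(pair, "positive" if labels[pair] else "negative", len(texts))
```

Labeling whole bags rather than single sentences reflects the multi-instance assumption that at least one sentence in a positive bag expresses the relation, while the others (such as the second toy sentence) may only mention both entities. A classifier would then be trained on features of the bags, a step this sketch leaves out.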