Journal: Frontiers in Genetics

Biomolecular Relationships Discovered from Biological Labyrinth and Lost in Ocean of Literature: Community Efforts Can Rescue Until Automated Artificial Intelligence Takes Over


Abstract

Many brilliant minds are at work deciphering the biological labyrinth, and as a result an immense amount of information about biological entities and their relationships is accumulating in the published literature (Hunter and Cohen, 2006). To cater to the needs of researchers, many tools have been designed to perform Named Entity Recognition (NER), Information Retrieval (IR), and Information Extraction (IE), viz. A Combined Clinical Concept Annotator (Kang et al., 2012), BANNER (Leaman and Gonzalez, 2008), Biblio-MetReS (Usie et al., 2014), BioTextQuest+ (Papanikolaou et al., 2014), BIOSMILE Web Search (Dai et al., 2008), E3Miner (Lee et al., 2008), EBIMed (Rebholz-Schuhmann et al., 2007), eFIP (Arighi et al., 2011), FACTA+ (Tsuruoka et al., 2008), GNSuite^(1), iHOP (Hoffmann and Valencia, 2004), MyMiner (Salgado et al., 2012), RLIMS-P (Hu et al., 2005), Anni (Jelier et al., 2008), CoPub (Frijters et al., 2008), MedScan (Novichkova et al., 2003), PPInterFinder (Raja et al., 2012), pGenN (Ding et al., 2015), SciMiner (Hur et al., 2009), BIGNER (Li et al., 2009), and a hybrid named entity tagger (Raja et al., 2014). More such tools can be obtained from the BIONLP resource^(2), and detailed analyses of many NLP tools are given by Krallinger et al. (2008) and Fleuren and Alkema (2015).

Table 1 gives informational and statistical insight into some of these literature mining tools, shedding light on their efficiency as expressed by the statistical parameters F-score, recall, and precision. Many tools are domain specific (for example, specific to the kinase family) yet still call for human intervention to ensure exactness, which limits their usage. Moreover, the output formats are sometimes too vague, such as mere name highlighting, to be put to use for larger literature searches.

Table 1. Informational (data used, evaluation parameters, and working platform) and statistical (F-value, recall, and precision) insights for a few literature mining tools, with brief descriptions and links to the tools' home pages.

| Tool | Event/Data used | Parameters | Platform | F-value (%) | Recall (%) | Precision (%) | Link | Description |
|---|---|---|---|---|---|---|---|---|
| A Combined Clinical Concept Annotator (Kang et al., 2012) | i2b2 challenge | Concept exact match task | Web (**) | 82.1 | 81.2 | 83.3 | http://www.biosemantics.org/ACCCA_WEB | Concept annotation system for clinical records |
| BANNER (Leaman and Gonzalez, 2008) | BioCreative 2 GM task | NER | Desktop | 81.96 | 79.06 | 85.09 | http://banner.sourceforge.net | Named entity recognition system, primarily intended for biomedical text |
| Biblio-MetReS (Usie et al., 2014) | Literature databases and journals | Biological entities and relationships | Desktop | 37 | 27 (*) | 58 | http://metres.udl.cat/ | Reconstructs networks from an always up-to-date set of scientific documents |
| BIOSMILE Web Search (Dai et al., 2008) | BioCreAtIvE II GM tagging task and IAS task | NER and PPI article classifier | Web (**) | 85.76 | 89.12 | 82.59 | http://bws.iis.sinica.edu.tw/BWS/ | Analyzes articles for selected biomedical verbs and lists abstracts, along with snippets, in order of relevance to protein–protein interaction |
| E3Miner (Lee et al., 2008) | 100 random abstracts | E3 related data | Web | 8 (*) | 74 | 97 | http://e3miner.biopathway.org/e3miner.html | Extracts novel E3 discoveries and important findings related to specific E3s from the literature |
| RLIMS-P (Hu et al., 2005) | BioCreative IV (BioCreative IAT) | Kinase, substrate and site | Web | 92 | 96 (*) | 88 | http://research.bioinformatics.udel.edu/rlimsp/ | Rule-based text-mining program designed to extract protein phosphorylation information (protein kinase, substrate, and phosphorylation sites) from biomedical literature |
| Anni 2.0 (Jelier et al., 2008) | Micro-array data and multiple publications | Associations between biological entities | Web (**) | 75.5 (*) | 76 | 75 | http://biosemantics.org/anni/ | Ontology-based interface to MEDLINE that retrieves documents and associations for several classes of biomedical concepts, including genes, drugs and diseases |
| PPInterFinder (Raja et al., 2012) | BioCreative workshop 2012 | NER, IR | Web (**) | 78.07 | 70.58 | 87.33 | http://www.biominingbu.org/ppinterfinder/ | Extracts human PPIs from MEDLINE abstracts using co-occurrences of relation keywords with protein names |
| pGenN (Ding et al., 2015) | 104 plant relevant abstracts | NER | Web | 88.9 | 87.2 | 90.9 | http://biotm.cis.udel.edu/gn/ | A gene normalization tool for plant genes and proteins in scientific literature |
| SciMiner (Hur et al., 2009) | BioCreAtIvE II | NER, IR | Desktop/Web | 75.8 | 87.1 | 71.3 | http://141.214.81.219/SciMiner/ | Identifies genes and proteins using a context-specific analysis of MEDLINE abstracts and full texts |
| BIGNER (Li et al., 2009) | BioCreative 2 GM | NER | Web (**) | 89.05 | 87.63 | 90.52 | http://202.118.75.18:8080/bioner | Locates gene/protein names in biomedical literature |

(*) These values were self-calculated from the given values.
(**) Web interface out of order.
i2b2, Informatics for Integrating Biology and the Bedside; GM, Ge
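
The table footnote says that some values were self-calculated from the given ones. The abstract does not state which formula was used, so the following is only a minimal sketch, assuming the usual balanced F-score (the harmonic mean of precision and recall). The function names `f_score` and `recall_from` are hypothetical helpers introduced for illustration; the input numbers are the percentages listed in Table 1.

```python
def f_score(precision, recall):
    """Balanced F-score (assumed F1): harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def recall_from(f, precision):
    """Solve F = 2PR / (P + R) for R, given F and P."""
    denom = 2 * precision - f
    if denom <= 0:
        raise ValueError("no valid recall exists for these F and precision values")
    return f * precision / denom


# Percentages taken from Table 1.
print(round(f_score(75, 76), 1))    # Anni 2.0: F from P=75, R=76 -> 75.5
print(round(recall_from(37, 58)))   # Biblio-MetReS: R from F=37, P=58 -> 27
print(round(recall_from(92, 88)))   # RLIMS-P: R from F=92, P=88 -> 96
```

Under this assumption the three results agree with the starred entries for Anni 2.0, Biblio-MetReS, and RLIMS-P. Applied to the E3Miner row (recall 74, precision 97), the same formula gives roughly 84.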
