Journal: Nucleic Acids Research

PolySearch: a web-based text mining system for extracting relationships between human diseases, genes, mutations, drugs and metabolites


Abstract

A particular challenge in biomedical text mining is to find ways of handling ‘comprehensive’ or ‘associative’ queries such as ‘Find all genes associated with breast cancer’. Given that many queries in genomics, proteomics or metabolomics involve these kinds of comprehensive searches, we believe that a web-based tool that could support them would be quite useful. In response to this need, we have developed the PolySearch web server. PolySearch supports more than 50 different classes of queries against nearly a dozen different types of text, scientific abstract or bioinformatic databases. The typical query supported by PolySearch is ‘Given X, find all Ys’, where X or Y can be diseases, tissues, cell compartments, gene/protein names, SNPs, mutations, drugs and metabolites. PolySearch also exploits a variety of techniques in text mining and information retrieval to identify, highlight and rank informative abstracts, paragraphs or sentences. PolySearch's performance has been assessed in tasks such as gene synonym identification, protein–protein interaction identification and disease gene identification using a variety of manually assembled ‘gold standard’ text corpora. Its F-measures on these tasks are 88, 81 and 79%, respectively. These values are between 5 and 50% better than those of other published tools. The server is freely available at http://wishart.biology.ualberta.ca/polysearch
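To make the ‘Given X, find all Ys’ query pattern and the F-measure evaluation concrete, the following is a minimal Python sketch. It is not PolySearch's actual method, which the abstract does not describe. It assumes a simple sentence-level co-occurrence count as the ranking signal and the balanced F-measure (harmonic mean of precision and recall). The function names `split_sentences`, `find_associations` and `f_measure`, the toy sentences and the toy gold standard are all hypothetical and were invented for illustration.

```python
import re
from collections import Counter


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def find_associations(abstracts, query_term, candidate_terms):
    """Given X (query_term), rank candidate Ys by the number of sentences
    in which they co-occur with X. Assumed scoring, for illustration only."""
    counts = Counter()
    query = query_term.lower()
    for abstract in abstracts:
        for sentence in split_sentences(abstract):
            lowered = sentence.lower()
            if query not in lowered:
                continue
            for term in candidate_terms:
                # Whole-word, case-insensitive match of the candidate term.
                if re.search(r"\b" + re.escape(term.lower()) + r"\b", lowered):
                    counts[term] += 1
    return counts.most_common()


def f_measure(predicted, gold):
    """Balanced F-measure of a predicted set against a gold-standard set."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Toy input invented for this sketch; not taken from any real corpus.
abstracts = [
    "BRCA1 mutations are linked to breast cancer. "
    "TP53 is also studied in breast cancer.",
    "BRCA1 carriers have an elevated breast cancer risk. "
    "EGFR is relevant to lung cancer.",
]

ranked = find_associations(abstracts, "breast cancer", ["BRCA1", "TP53", "EGFR"])
predicted = [term for term, _ in ranked]
score = f_measure(predicted, gold={"BRCA1", "TP53", "BRCA2"})

print(ranked)           # [('BRCA1', 2), ('TP53', 1)]
print(round(score, 2))  # 0.8
```

In this toy run, both predicted genes are in the gold set (precision 1.0) but one gold gene is missed (recall 2/3), which gives an F-measure of 0.8. A real system would also need synonym handling and a more robust sentence splitter.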
