Journal: Bioinformatics

Gene symbol disambiguation using knowledge-based profiles


Abstract

Motivation: The ambiguity of biomedical entities, particularly of gene symbols, is a big challenge for text-mining systems in the biomedical domain. Existing knowledge sources, such as Entrez Gene and the MEDLINE database, contain information concerning the characteristics of a particular gene that could be used to disambiguate gene symbols.

Results: For each gene, we create a profile with different types of information automatically extracted from related MEDLINE abstracts and readily available annotated knowledge sources. We apply the gene profiles to the disambiguation task via an information retrieval method, which ranks the similarity scores between the context where the ambiguous gene is mentioned, and candidate gene profiles. The gene profile with the highest similarity score is then chosen as the correct sense. We evaluated the method on three automatically generated testing sets of mouse, fly and yeast organisms, respectively. The method achieved the highest precision of 93.9% for the mouse, 77.8% for the fly and 89.5% for the yeast.
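
The ranking step described in the abstract can be illustrated with a minimal sketch. The abstract does not state which term weighting or similarity measure the authors used, so this sketch assumes a plain bag-of-words TF-IDF representation with cosine similarity. The function names (`tokenize`, `tfidf_vectors`, `cosine`, `disambiguate`), the gene identifiers `GENE_A` and `GENE_B`, and the profile texts are all hypothetical and are not taken from the paper or from Entrez Gene.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (dicts) for a list of token lists."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    # Smoothed IDF (an assumption of this sketch, not the paper's weighting).
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    return [{t: tf * idf[t] for t, tf in Counter(doc).items()} for doc in docs]


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def disambiguate(context, profiles):
    """Rank candidate gene profiles against the mention context.

    `profiles` maps a candidate gene ID to its profile text and must be
    non-empty. Returns the best-scoring gene ID and the full ranking.
    """
    gene_ids = list(profiles)
    docs = [tokenize(profiles[g]) for g in gene_ids] + [tokenize(context)]
    vectors = tfidf_vectors(docs)
    context_vec = vectors[-1]
    ranked = sorted(
        ((g, cosine(context_vec, v)) for g, v in zip(gene_ids, vectors)),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return ranked[0][0], ranked


# Toy, invented profiles for two candidate senses of one ambiguous symbol.
profiles = {
    "GENE_A": "kinase signaling cell cycle phosphorylation",
    "GENE_B": "wing development embryo segment polarity",
}
context = "The symbol was reported to regulate wing development in the embryo."

best, ranked = disambiguate(context, profiles)
print(best)
print(ranked)
```

Here the context shares terms only with the second toy profile, so `GENE_B` is chosen as the sense. A real profile would combine several information types per gene, as the abstract describes, and the similarity function could be swapped for whatever retrieval model is preferred.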
