Applying Biomedical Ontologies on Semantic Query Expansion


Abstract

*1- Introduction* The interpretation of a question (or information need) depends, among other things, on a series of lexical-semantic relations that complement and support the cognitive process of answering that information need. Despite this, current information retrieval mechanisms take little advantage of the semantic interpretation of users' information needs, which are usually specified through keywords. In most cases those mechanisms are based on keyword matching and are therefore excessively dependent on the query and document terms. Several past results show that, in general, information retrieval based on domain knowledge decreases the accuracy of keyword-based search engines. We believe this approach deserves further discussion and experimentation, in search of stronger evidence that these negative results can really be generalized. Moreover, previous work left some questions unanswered that our experiment addresses: (_i_) Would using a scientific ontology with formal construction and maintenance processes, such as the OBO ontologies, produce better results? (_ii_) Are there more efficient query expansion techniques that use the available domain knowledge? (_iii_) Is a scientific ontology complete enough to fulfill information retrieval researchers' needs, in general?

*2- Semantic Query Expansion* To try to answer some of these questions, we ran a query expansion experiment using the Gene Ontology (GO) as domain knowledge. As the document repository, we used an extract of 10 years of PubMed publications (1994 to 2004) containing approximately 4.6 million documents. This dataset, called Genomic TREC, is a test collection used by the information retrieval community.

*3- Results* To evaluate our ontology-based semantic query expansion technique, we measured the effectiveness of the information retrieval mechanism with and without expansion.
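The with/without-expansion comparison described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the synonym table and documents are hypothetical toy data (not actual Gene Ontology or PubMed content), the substring-matching retriever stands in for a real search engine, and average precision is used as the effectiveness measure only because the abstract does not name the one the authors used.

```python
from typing import Dict, List, Set

# Hypothetical toy synonym table (term -> synonyms); not real GO content.
SYNONYMS: Dict[str, List[str]] = {
    "apoptosis": ["programmed cell death"],
}

# Hypothetical toy collection and relevance judgments for one query.
DOCS: Dict[str, str] = {
    "d1": "Apoptosis is induced by p53 in tumor cells.",
    "d2": "Programmed cell death regulates tissue development.",
    "d3": "Cell migration during wound healing.",
}
RELEVANT: Set[str] = {"d1", "d2"}


def expand_query(terms: List[str], synonyms: Dict[str, List[str]]) -> List[str]:
    """Append each term's synonyms to the query, skipping duplicates."""
    expanded = list(terms)
    for term in terms:
        for synonym in synonyms.get(term, []):
            if synonym not in expanded:
                expanded.append(synonym)
    return expanded


def retrieve(terms: List[str], docs: Dict[str, str]) -> List[str]:
    """Rank documents by how many query terms they contain (substring match)."""
    scored = []
    for doc_id, text in docs.items():
        lowered = text.lower()
        score = sum(1 for term in terms if term in lowered)
        if score > 0:
            scored.append((score, doc_id))
    # Highest score first; ties broken by document id for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored]


def average_precision(ranked: List[str], relevant: Set[str]) -> float:
    """Mean of precision values at the rank of each relevant document found."""
    hits = 0
    total = 0.0
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


if __name__ == "__main__":
    query = ["apoptosis"]
    baseline = average_precision(retrieve(query, DOCS), RELEVANT)
    expanded = average_precision(
        retrieve(expand_query(query, SYNONYMS), DOCS), RELEVANT
    )
    print(f"without expansion: {baseline:.2f}")
    print(f"with synonym expansion: {expanded:.2f}")
```

In this toy setting the synonym lets the expanded query reach the second relevant document that the keyword alone misses. A real evaluation would average such scores over all queries in the test collection.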
In a nutshell, the average result showed a 28% increase in effectiveness for expansion with synonym relations and a small decrease for the other relations. Our results are largely consistent with past related work. In fact, if the expansion strategy does not choose selectively when and how to expand, only synonym relations are worth using. Looking further, however, there are several opportunities to try other expansion strategies. For example, the problem with query expansion using generalization/specialization relationships is that, if it is always applied, bad results are more frequent than good ones. But if the strategy is selective about when to use these relations for expansion, the increase in accuracy can be outstanding: in our experiment, one query showed a 98% increase in effectiveness.

*4- Conclusion* We strongly believe that it is premature to assume that semantics-based query expansion is, in general, a recall-enhancing, precision-degrading technique. Our experiments suggest that by using scientifically based ontologies (such as the OBO ontologies) with formal relations, it is possible to increase both recall and precision. Our group is currently revising this first experiment to arrive at a better semantic query expansion strategy.

*5- Acknowledgements* This work was partially funded by CAPES and CNPq research grants 311454/2006-2, 306889/2007-2 and 484713/2007-8.

*References*
_Fox E. Lexical relations enhancing effectiveness of information retrieval systems. SIGIR Forum, New York, v.15, n.3, p.5-3._
_Voorhees E. Query expansion using lexical-semantic relations. In: ACM SIGIR conference on research and development in information retrieval, Proceedings, Dublin:17, p.61–69, 1994._
