Journal: Information Retrieval

Semantic Similarity Search on Semistructured Data with the XXL Search Engine

Abstract

Query languages for XML such as XPath or XQuery support Boolean retrieval: a query result is a (possibly restructured) subset of XML elements or entire documents that satisfy the search conditions of the query. This search paradigm works for highly schematic XML data collections such as electronic catalogs. However, for searching information in open environments such as the Web or intranets of large corporations, ranked retrieval is more appropriate: a query result is a ranked list of XML elements in descending order of (estimated) relevance. Web search engines, which are based on the ranked retrieval paradigm, do not, however, consider the additional information and rich annotations provided by the structure of XML documents and their element names. This article presents the XXL search engine, which supports relevance ranking on XML data. XXL is particularly geared toward path queries with wildcards that can span multiple XML collections and contain both exact-match and semantic-similarity search conditions. In addition, ontological information and suitable index structures are used to improve search efficiency and effectiveness. XXL is fully implemented as a suite of Java classes and servlets. Experiments in the context of the INEX benchmark demonstrate the efficiency of the XXL search engine and underline its effectiveness for ranked retrieval.