Journal: ACM Transactions on Information Systems

Preparing Heterogeneous XML for Full-Text Search


Abstract

XML retrieval is facing new challenges when applied to heterogeneous XML documents, where next to nothing about the document structure can be taken for granted. We have developed solutions where some of the heterogeneity issues are addressed. Our fragment selection algorithm selectively divides a heterogeneous document collection into equi-sized fragments with full-text content. If the content is considered too data-oriented, it is not accepted. The algorithm needs no information about element names. In addition, three techniques for fragment expansion are presented, all of which yield a 13-17% average improvement in average precision. These techniques and algorithms are among the first steps in developing document-type-independent indexing methods for the full text in heterogeneous XML collections.
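
The fragment selection idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no thresholds or scoring details, so the size bounds (`min_chars`, `max_chars`), the text-node-to-element-node ratio used to detect data-oriented content, and all function names are assumptions made for illustration. The one property taken directly from the abstract is that the code never inspects element names.

```python
import xml.etree.ElementTree as ET


def text_length(elem):
    """Number of characters of text in the subtree, ignoring surrounding whitespace."""
    return sum(len(t.strip()) for t in elem.itertext())


def text_to_element_ratio(elem):
    """Non-empty text nodes divided by element nodes in the subtree.

    Assumed heuristic: mixed-content prose tends to score high, while
    record-like markup (many elements, little text) scores low.
    """
    text_nodes = sum(1 for t in elem.itertext() if t.strip())
    element_nodes = sum(1 for _ in elem.iter())
    return text_nodes / element_nodes


def select_fragments(elem, min_chars, max_chars, min_ratio=1.0):
    """Return subtrees of similar size that look like full text.

    Element names are never consulted; only size and structure are used.
    """
    size = text_length(elem)
    if size < min_chars:
        return []  # too small to be a useful fragment
    if size <= max_chars and text_to_element_ratio(elem) >= min_ratio:
        return [elem]  # right size and text-oriented: accept
    # Too large or too data-oriented: look for candidates among the children.
    fragments = []
    for child in elem:
        fragments.extend(select_fragments(child, min_chars, max_chars, min_ratio))
    return fragments


# Hypothetical input: two prose-like elements and one record-like element.
sample = ET.fromstring(
    "<a><b><c>" + "word " * 40 + "</c><d>" + "text " * 40 + "</d></b>"
    "<e><f>1</f><g>2</g><h>3</h></e></a>"
)
selected = select_fragments(sample, min_chars=100, max_chars=300)
print([f.tag for f in selected])  # prints ['c', 'd']
```

In this sketch the root and `b` are too large and are split, `c` and `d` fall inside the size range and are accepted, and the record-like `e` is discarded for being too small. The fragment expansion techniques mentioned in the abstract are not covered here, since the abstract does not describe them.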
