Published in: IEEE International Conference on Data Engineering

FlashSchema: Achieving High Quality XML Schemas with Powerful Inference Algorithms and Large-scale Schema Data


Abstract

Obtaining high-quality XML schemas to avoid or reduce application risks is an important practical problem, and existing work has yet to address several of its aspects satisfactorily. In this paper, we propose FlashSchema, a tool for high-quality XML schema design that supports one-pass schema design, interactive schema design, and schema recommendation. To the best of our knowledge, no other existing tool supports interactive schema design and schema recommendation. One salient feature of our work is the design of algorithms to infer k-occurrence interleaving regular expressions, which are not only more powerful in model capacity, but also more efficient. These algorithms also form the basis of our interactive schema design. The other feature is that, starting from large-scale schema data harvested from the Web, we devise a new solution for type inference and propose schema recommendation for schema design. Finally, we conduct a series of experiments on two XML datasets, comparing against 9 state-of-the-art algorithms and open-source tools in terms of running time, preciseness, and conciseness. The experimental results show that our work achieves the highest level of preciseness and conciseness within only a few seconds. The results and examples also demonstrate the effectiveness of our type inference and schema recommendation methods.
