Source: International Conference on Web Information Systems Engineering

Web Page Template and Data Separation for Better Maintainability

Abstract

Separating a web page into template code and the data records that populate the template is an important problem with a wide range of applications in web page compression and information extraction. We study this problem with the aim of separating a web page into easily maintainable template code and data records. We show that this problem is NP-hard and then propose a heuristic algorithm to solve it. The main idea of our algorithm is to parse a web page into a tree and then process the tree recursively, in a bottom-up manner, in three steps: splitting, folding, and alignment. We perform experiments on real datasets to evaluate how well our proposed algorithms maximize the maintainability of the template code they produce. The experimental results show that our proposed algorithms outperform the baseline algorithms by 25% in the maintainability measure.
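
To make the template/data split concrete, the sketch below illustrates only the general idea of folding repeated sibling subtrees into a single repeat block plus a list of records. It is not the algorithm from the paper: the function names (`has_text`, `shape`, `texts`, `skeleton`, `fold`), the `<!--repeat-->` marker, and the `{}` slot notation are all assumptions made for illustration, and the splitting and alignment steps are not modeled. It also assumes well-formed markup and ignores attributes and tail text.

```python
import xml.etree.ElementTree as ET
from itertools import groupby


def has_text(node):
    """True if the element carries non-whitespace text of its own."""
    return bool((node.text or "").strip())


def shape(node):
    """Structural signature: tag, whether text is present, and child shapes."""
    return (node.tag, has_text(node), tuple(shape(c) for c in node))


def texts(node):
    """Non-empty text values of a subtree, in document order (tail text ignored)."""
    own = [node.text.strip()] if has_text(node) else []
    return own + [t for c in node for t in texts(c)]


def skeleton(node):
    """Markup of a subtree with each text value replaced by a '{}' slot."""
    slot = "{}" if has_text(node) else ""
    inner = "".join(skeleton(c) for c in node)
    return f"<{node.tag}>{slot}{inner}</{node.tag}>"


def fold(node):
    """Return (template, records).

    Runs of consecutive same-shaped siblings become one repeat block;
    their text values move into `records`. Everything else stays in the
    template as static markup. Attributes are dropped in this toy version,
    and records from all repeat blocks share one flat list.
    """
    parts, records = [], []
    for _, group in groupby(node, key=shape):
        run = list(group)
        if len(run) > 1:
            parts.append(f"<!--repeat-->{skeleton(run[0])}<!--/repeat-->")
            records.extend(texts(item) for item in run)
        else:
            sub_template, sub_records = fold(run[0])
            parts.append(sub_template)
            records.extend(sub_records)
    static = (node.text or "").strip()
    return f"<{node.tag}>{static}{''.join(parts)}</{node.tag}>", records


# Hypothetical input page: one static heading and two structurally identical rows.
page = (
    "<div><h1>Staff</h1><ul>"
    "<li><b>Ann</b><i>30</i></li>"
    "<li><b>Bob</b><i>25</i></li>"
    "</ul></div>"
)
template, records = fold(ET.fromstring(page))
print(template)
print(records)
```

In this toy input the heading text stays in the template because it occurs once, while the two `li` rows collapse into a single repeat block whose values are held separately as records. A shorter template with fewer duplicated fragments is the kind of outcome a maintainability measure would plausibly reward, though the measure used in the paper is not specified in this abstract.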