Conference: International Conference on Web Information Systems Engineering

ABLA: An Algorithm for Repairing Structure-Based Locators Through Attribute Annotations


Abstract

The web has grown relentlessly over the last decade, driving an increasing demand for extracting information from it. This growth has also created the need to adapt web pages to user requirements, create annotations, and test web applications. As web pages have evolved, these techniques have become more complex to implement. Being able to test, annotate, adapt, and extract information from web pages correctly and efficiently has become a primary task. Performing these tasks requires mechanisms that locate the desired elements effectively and unambiguously throughout the web page life cycle, especially as a page evolves. The mechanisms used to find web nodes, called locators, are prone to fail over time owing to changes on websites. Many authors improve the life expectancy of locators by developing algorithms that use different types of locators. Others have created algorithms that regenerate locators by saving extra information about the previous structure of the website. These algorithms extend the useful life of locators, but at a much higher computational and storage cost. To avoid these problems, we have designed an algorithm that employs an attribute system embedded in the HTML code. The algorithm is able to regenerate the locators based on these attributes every time a single change takes place in a given element attribute. The evaluation of the proposal shows a much lower computational cost than in previous works.
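To make the general idea concrete, the following is a minimal sketch of regenerating a structure-based locator from an attribute embedded in the HTML. It is not the ABLA algorithm itself, whose details the abstract does not give. The attribute name `data-locator-id`, the function `regenerate_locator`, and the sample pages are all assumptions made for illustration, and the sketch assumes well-formed (XML-parsable) markup.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Hypothetical annotation attribute; the name is an assumption, not from the paper.
ANNOTATION_ATTR = "data-locator-id"


def regenerate_locator(page_source: str, annotation: str) -> Optional[str]:
    """Return a positional XPath for the element carrying the annotation, or None."""
    root = ET.fromstring(page_source)
    # ElementTree has no parent pointers, so build a child -> parent map.
    parents = {child: parent for parent in root.iter() for child in parent}
    target = next(
        (el for el in root.iter() if el.get(ANNOTATION_ATTR) == annotation),
        None,
    )
    if target is None:
        return None

    steps = []
    node = target
    while node is not None:
        parent = parents.get(node)
        if parent is None:
            # The root element needs no positional index.
            steps.append(node.tag)
        else:
            # 1-based position among siblings that share the same tag.
            same_tag = [c for c in parent if c.tag == node.tag]
            steps.append(f"{node.tag}[{same_tag.index(node) + 1}]")
        node = parent
    return "/" + "/".join(reversed(steps))


old_page = (
    "<html><body><div><p>intro</p>"
    "<p data-locator-id='price'>10</p></div></body></html>"
)
new_page = (
    "<html><body><div>banner</div><div><p>intro</p><p>note</p>"
    "<p data-locator-id='price'>12</p></div></body></html>"
)

print(regenerate_locator(old_page, "price"))  # /html/body[1]/div[1]/p[2]
print(regenerate_locator(new_page, "price"))  # /html/body[1]/div[2]/p[3]
```

In this sketch the structural path stored for the old page no longer matches after the page changes, but the embedded attribute still identifies the element, so a fresh path can be rebuilt without keeping a copy of the previous page structure.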
