Annotations play an increasingly crucial role in scientific exploration and discovery as the amount of data and the level of collaboration among scientists increase. Many systems today focus on annotation management, querying, and propagation. Although all such systems are implemented to take user input (i.e., the annotations themselves), very few are user-driven, taking into account user preferences on how annotations should be propagated and applied over data. In this thesis, we propose to treat annotations as first-class citizens for scientific data by introducing a user-driven, view-based annotation framework. Under this framework, we try to answer two critical questions. First, how do we support annotations that are scalable from both a system point of view and a user point of view? Second, how do we support annotation queries from both an annotator point of view and a user point of view, in an efficient and accurate way?

To address these challenges, we propose the VIew-based annotation Propagation (ViP) framework, which empowers users to express their preferences over the time semantics and the network semantics of annotations, and which defines three query types for annotations. To support this novel functionality efficiently, ViP utilizes database views and introduces new annotation caching techniques. The use of views also yields a more compact representation of annotations, making our system easier to scale. Through an extensive experimental study on a real system (with both synthetic and real data), we show that the ViP framework can seamlessly introduce user-driven annotation propagation semantics while significantly improving performance (in terms of query execution time) over the current state of the art.